Abstract

Are we as a society getting more polarized, and if so, why? We try to answer this question 
through a model of opinion formation. Empirical studies have shown that homophily results in 
polarization. However, we show that DeGroot's well-known model of opinion formation based 
on repeated averaging can never be polarizing, even if individuals are arbitrarily homophilous. 
We generalize DeGroot's model to account for a phenomenon well-known in social psychology
as biased assimilation: when presented with mixed or inconclusive evidence on a complex issue,
individuals draw undue support for their initial position thereby arriving at a more extreme
opinion. We show that in a simple model of homophilous networks, our biased opinion formation
process results in either polarization, persistent disagreement or consensus depending on how
biased individuals are. In other words, homophily alone, without biased assimilation, is not
sufficient to polarize society. Quite interestingly, biased assimilation also provides insight into the
following related question: do internet based recommender algorithms that show us personalized
content contribute to polarization? We make a connection between biased assimilation and the
polarizing effects of some random-walk based recommender algorithms that are similar in spirit
to some commonly used recommender algorithms.

1 Introduction



The issue of polarization in society has been extensively studied and vigorously debated in the 
academic literature as well as the popular press over the last few decades. In particular, are we 
as a society getting more polarized, if so, why, and how can we fix it? Different empirical studies 
arrive at different answers to this question depending on the context and the metric used to measure 
polarization. 

Evidence of polarization in politics has been found in the increasingly partisan voting patterns 
of the members of Congress [PR84, PR91] and in the extreme policies adopted by candidates for 
political office [Hil09]. McCarty et al. [MPR06] claim via rigorous analysis that America is polarized
in terms of political attitudes and beliefs. Phenomena such as segregation in urban residential
neighborhoods [Sch71, BM06, BIKK12], the rising popularity of overtly partisan television news
networks [Bil09, Bil10], and the readership and linking patterns of blogs along partisan lines [AG05,
HG07, GBK09, LSF10] can all be viewed as further evidence of polarization. On the other hand, 
it has also been argued on the basis of detailed surveys of public opinion that society as a whole 
is not polarized, even though the media and the politicians make it seem so [Wol99, FAP05]. We 
adopt the view that polarization is not a property of a state of society; instead it is a property of 
the dynamics of interaction between individuals. 

It has been argued that homophily, i.e., greater interaction with like-minded individuals, results 
in polarization [BHK+96, Sun02, GBK09]. Evidence in support of this argument has been used 
to claim that the rise of cable news, talk radio and the Internet has contributed to polarization: 
the increased diversity of information sources coupled with the increased ability to narrowly tailor 
them to one's specific tastes (either manually or algorithmically through, for example, recommender 
systems) has an echo-chamber effect which ultimately results in increased polarization. 

A rich body of work attempts to explain polarization through variants of a well-known opinion 
formation model due to DeGroot [DeG74]. In DeGroot's model, individuals are connected to each 
other in a social network. The edges of the network have associated weights representing the 
extent to which neighbors influence each other's opinions. Individuals update their opinion as a 
weighted average of their current opinion and that of their neighbors. Variants of this model (e.g.,
[FJ90, Kra00, ACFO10, BKO11]) account for the empirical observation that in many cases there is
persistent disagreement between individuals and consensus is never reached. However, we show that
repeated averaging of opinions, which underlies these models, always results in opinions that are 
less divergent compared to the initial opinions, even if individuals are arbitrarily homophilous. As 
a result, this entire body of work appears to fall short of explaining polarization which is generally 
perceived to mean an increased divergence of opinions, not just persistent disagreement. In this 
paper, we seek a more satisfactory model of opinion formation that (a) is informed by a theory of 
how individuals actually form opinions, and (b) produces an increased divergence of opinions under 
intuitive conditions. 

We base our model on a well-known phenomenon in social psychology called biased assimilation, 
according to which individuals process new information in a biased manner whereby they readily 
accept confirming evidence while critically examining disconfirming evidence. Suppose that individuals
with opposing views on an issue are shown mixed or inconclusive evidence. Intuitively,
exposure to such evidence would engender greater agreement, or at least a moderation of views.
However, in a seminal paper, Lord et al. [LRL79] showed through experiments that biased assimilation
causes individuals to arrive at more extreme opinions after being exposed to identical,
inconclusive evidence. This finding has been reproduced in many different settings over the years
(e.g., [MMBD93, MDL+02, TL06]). We use biased assimilation as the basis of our model of opinion
formation and show that in our model homophily alone, without biased assimilation, is not sufficient
to polarize society.

1.1 Summary of Contributions 

We propose a generalization of DeGroot's model that accounts for biased assimilation. Like DeGroot's
model, our opinion formation process unfolds over an exogenously defined social network
represented by a weighted undirected graph $G = (V, E)$. Each individual $i \in V$ has an opinion
$x_i(t) \in [0,1]$, which represents his degree of support at time step $t$ for the position represented
by 1. In order to weight confirming evidence more heavily relative to disconfirming evidence, opinions
are updated as follows: individual $i$ weights each neighbor $j$'s opinion $x_j(t)$ by a factor $(x_i(t))^{b_i}$
and weights the opposing view $(1 - x_j(t))$ by a factor $(1 - x_i(t))^{b_i}$, where $b_i \ge 0$ is a bias
parameter. Informally, $b_i$ represents the bias with which $i$ assimilates his neighbors' opinions. When
$b_i = 0$, our model reduces to DeGroot's, and corresponds to unbiased assimilation. Our biased
opinion formation process mathematically reproduces the effect empirically observed by Lord et al.
(Theorem 1).

We measure divergence of opinions in terms of the network disagreement index (NDI), which we
define to be $\sum_{(i,j) \in E} w_{ij} (x_i(t) - x_j(t))^2$. It is similar to the notion of social cost used by Bindel et
al. [BKO11]. We say that an opinion formation process is polarizing if the NDI at the end of the
process is greater than that initially. We show that:

• (Theorem 2) DeGroot-like repeated averaging processes can never be polarizing, even if individuals
are arbitrarily homophilous (i.e., the underlying network is presented adversarially as
opposed to being generated by a mathematical model).

• (Theorem 4) The biased opinion formation process over a simple model of networks with
homophily results in polarization if individuals' bias parameter $b \ge 1$. If $b < 1$, the process
results in either persistent disagreement or consensus depending on the degree of homophily.

In summary, we show that homophily alone, without biased assimilation, is not sufficient to polarize 
society. This conclusion disagrees with the literature (e.g., [BHK+96, Sun02]) that proposes homophily
as the predominant cause of polarization. As the reader might expect, there are many ways
of mathematically measuring the divergence of opinions among individuals. Many of our results 
hold for more general measures of divergence, which we discuss in Section 6. 

The notion of biased assimilation also provides insight into the following related question: do
internet based recommender algorithms that show us personalized content contribute to polarization?
We analyze the polarizing effects of three recommender algorithms (SimpleSALSA, SimplePPR,
and SimpleICF) that are similar in spirit to three well-known algorithms from the literature:
SALSA [LM01], Personalized PageRank [PBMW99], and Item-based Collaborative Filtering
[LSY03]. For a simple, natural model of the underlying user-item graph, and under reasonable
assumptions, we show that SimplePPR, which recommends the item that is most relevant to a user
based on a PageRank-like score, is always polarizing (Theorem 5). On the other hand, SimpleSALSA
and SimpleICF, which first choose a random item liked by the user and recommend an
item similar to that item, are polarizing only if individuals are biased (Theorem 6). Designing
algorithms and online social systems that reduce polarization, for example, by counteracting biased
assimilation, is a promising research direction.

2 Model 

Our opinion formation process unfolds over a social network represented by a connected weighted
undirected graph $G = (V, E, w)$. The nodes in $V$ represent individuals and the edges represent
friendships or relationships between them. Let $|V| = n$. An edge $(i,j) \in E$ is associated with a
weight $w_{ij} > 0$ representing the degree of influence $i$ and $j$ have on each other. Each individual $i \in V$
also has an associated weight $w_{ii} \ge 0$ representing the degree to which the individual weights his
own opinions. We will denote by $N(i)$ the set of neighbors of $i$, that is, $N(i) := \{j \in V : (i,j) \in E\}$.

An individual $i$ has an opinion $x_i(t) \in [0,1]$ at time step $t = 0, 1, 2, \ldots$. The extreme opinions
0 and 1 represent two opposing points of view on an issue. So $x_i(t)$ can be interpreted as individual
$i$'s degree of support at time $t$ for the position represented by 1, and $1 - x_i(t)$ as the degree of
support for the position represented by 0. Let $x(t) \in [0,1]^n$ denote the vector of opinions at time $t$.
An opinion formation process is simply a description of how individuals update their opinions, i.e.,
for each individual $i \in V$, it defines $x_i(t+1)$ as a function of the vector of opinions, $x(t)$, at time $t$.



2.1 Measuring Polarization 

We view polarization as a property of an opinion formation process instead of a property of a state 
of the network. We characterize polarization as a verb as opposed to a noun, i.e., we say that an 
opinion formation process is polarizing if it results in an increased divergence of opinions. One could 
mathematically capture divergence of opinions in many different ways. We measure divergence in 
terms of the network disagreement index defined below. 

Definition 2.1 (Network Disagreement Index (NDI)). Given a graph $G = (V, E, w)$ and a vector
of opinions $x \in [0,1]^n$ of individuals in $V$, the network disagreement index $\eta(G, x)$ is defined as

$$\eta(G, x) := \sum_{(i,j) \in E} w_{ij} (x_i - x_j)^2 \tag{2.1}$$

Consider an opinion formation process over a network $G = (V, E, w)$ that transforms a set of
initial opinions $x \in [0,1]^n$ into a set of opinions $x' \in [0,1]^n$. Then, we say the process is polarizing
if $\eta(G, x') > \eta(G, x)$, and vice versa.
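As a small illustration of Definition 2.1, the following Python sketch computes the NDI and applies the polarizing test to a pair of opinion vectors. The function names `ndi` and `is_polarizing`, the edge-dictionary representation, and the three-node example graph are assumptions made for illustration only; they do not come from the paper.

```python
def ndi(edges, x):
    """Network disagreement index (2.1).

    `edges` maps each undirected edge (i, j), listed once, to its weight w_ij.
    `x` is the list of opinions, indexed by node.
    """
    return sum(w * (x[i] - x[j]) ** 2 for (i, j), w in edges.items())


def is_polarizing(edges, x_before, x_after):
    """True if moving from x_before to x_after strictly increases the NDI."""
    return ndi(edges, x_after) > ndi(edges, x_before)


# Hypothetical path graph 0 - 1 - 2 with unit weights.
edges = {(0, 1): 1.0, (1, 2): 1.0}
x0 = [0.4, 0.5, 0.6]  # mild disagreement
x1 = [0.1, 0.5, 0.9]  # end nodes have moved toward the extremes

print(ndi(edges, x0), ndi(edges, x1))
print(is_polarizing(edges, x0, x1))
```

Note that the test compares only the initial and final opinion vectors; it says nothing about the path the process takes between them.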

The NDI is similar to the notion of social cost used by Bindel et al. [BKO11]. Each term $w_{ij}(x_i - x_j)^2$
can be viewed as the cost of disagreement imposed upon $i$ and $j$. This view that the social cost
depends on the magnitude of the difference of opinions along edges is consistent with theories in 
social psychology according to which attitude conflicts in relationships are a source of psychological 
stress or instability [Hei46, Fes57]. The NDI captures the phenomenon of issue radicalization, 
i.e., pre-existing groups of individuals becoming progressively more extreme. Admittedly, it does 
not entirely capture an aspect of polarization called issue alignment [BG08] whereby individuals 
with diverse opinions organize into ideologically coherent, but opposing factions. However, there 
is significant empirical evidence [MPR06, BG08, Cas12] that issue radicalization is more prevalent
compared to issue alignment, and hence NDI captures the most salient aspects of polarization. 
Many of our results hold for more general measures of divergence which we discuss in Section 6. 

2.2 DeGroot's Repeated Averaging Model 

In his seminal work on opinion formation, DeGroot [DeG74] proposed a model where at each time 
step, individuals simultaneously update their opinion to the weighted average of their neighbors' 
and their own opinion at the previous time step. 

Definition 2.2 (DeGroot's Repeated Averaging Process). The opinion of individual $i$ at time $t+1$,
$x_i(t+1)$, is given by

$$x_i(t+1) = \frac{w_{ii} x_i(t) + s_i(t)}{w_{ii} + d_i} \tag{2.2}$$

where $s_i(t) := \sum_{j \in N(i)} w_{ij} x_j(t)$ is the weighted sum of the opinions of $i$'s neighbors, and
$d_i := \sum_{j \in N(i)} w_{ij}$ is $i$'s weighted degree.

Recall that $x_j(t)$ and $1 - x_j(t)$ represent the degree of support for extremes 1 and 0, respectively.
Then, opinion update under DeGroot's process is equivalent to taking a weighted average of the
total support for 0 and that for 1. The weight that individual $i$ places on 1 (and on 0) is computed
by summing the degrees of support of $i$'s neighbors weighted by the influence of each neighbor on $i$.
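The update (2.2) can be sketched in a few lines of Python. The adjacency representation (a list of per-node dictionaries), the name `degroot_step`, and the three-node example with unit self-weights are illustrative assumptions, not part of the paper.

```python
def degroot_step(w, self_w, x):
    """One synchronous DeGroot update (2.2).

    w[i] is a dict {j: w_ij} over the neighbors of i (symmetric weights),
    self_w[i] is w_ii, and x is the current opinion vector.
    """
    new_x = []
    for i, x_i in enumerate(x):
        s_i = sum(w_ij * x[j] for j, w_ij in w[i].items())  # weighted neighbor opinions
        d_i = sum(w[i].values())                            # weighted degree
        new_x.append((self_w[i] * x_i + s_i) / (self_w[i] + d_i))
    return new_x


# Hypothetical path graph 0 - 1 - 2 with unit edge weights and unit self-weights.
w = [{1: 1.0}, {0: 1.0, 2: 1.0}, {1: 1.0}]
self_w = [1.0, 1.0, 1.0]

x = [0.0, 0.5, 1.0]
for _ in range(200):
    x = degroot_step(w, self_w, x)
print(x)  # the opinions approach a common value
```

All new opinions are computed from the old vector before any is overwritten, which is what makes the update simultaneous rather than asynchronous.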



2.3 Biased Opinion Formation Model 

We generalize DeGroot's model to account for biased assimilation. Biased assimilation is a well-known
phenomenon in social psychology described by Lord et al. [LRL79] in their seminal paper
as follows: 

People who hold strong opinions on complex social issues are likely to examine relevant empirical 
evidence in a biased manner. They are apt to accept "confirming" evidence at face value while 
subjecting "disconfirming" evidence to critical evaluation, and as a result to draw undue support 
for their initial positions from mixed or random empirical findings. 

Lord et al. [LRL79] showed through experiments that biased assimilation of mixed or inconclusive 
evidence does indeed result in more extreme opinions. 

In order to account for biased assimilation, we propose a biased opinion formation process. Recall
that $x_i(t)$ can be viewed as the degree of support for the position represented by 1. Individuals weight
confirming evidence more heavily relative to disconfirming evidence by updating their opinions as
follows: individual $i$ weights each neighbor $j$'s support for 1 (i.e., $x_j(t)$) by an additional factor
$(x_i(t))^{b_i}$, where $b_i \ge 0$ is a bias parameter. Therefore, $x_i(t+1) \propto (x_i(t))^{b_i} w_{ij} x_j(t)$. Similarly, $i$
weights $j$'s support for 0 (i.e., $1 - x_j(t)$) by $(1 - x_i(t))^{b_i}$, and so
$(1 - x_i(t+1)) \propto (1 - x_i(t))^{b_i} w_{ij} (1 - x_j(t))$. Informally, $b_i$ represents the bias with which $i$
assimilates his neighbors' opinions.

Illustrative example. Consider a graph with two nodes, $i$ and $j$, connected by an edge with
a weight $w_{ij}$. Then, according to the biased opinion formation process, $i$'s opinion at time $t+1$,
$x_i(t+1)$, is given by

$$x_i(t+1) = \frac{w_{ii} x_i(t) + (x_i(t))^{b_i} w_{ij} x_j(t)}{w_{ii} + (x_i(t))^{b_i} w_{ij} x_j(t) + (1 - x_i(t))^{b_i} w_{ij} (1 - x_j(t))}$$

More generally, the opinion update of individual i in the biased opinion formation process is defined 
as below. 

Definition 2.3 (Biased Opinion Formation Process). Under the biased opinion formation process,
the opinion of individual $i$ at time $t+1$, $x_i(t+1)$, is given by

$$x_i(t+1) = \frac{w_{ii} x_i(t) + (x_i(t))^{b_i} s_i(t)}{w_{ii} + (x_i(t))^{b_i} s_i(t) + (1 - x_i(t))^{b_i} (d_i - s_i(t))} \tag{2.3}$$

where, as before, $s_i(t) := \sum_{j \in N(i)} w_{ij} x_j(t)$ is the weighted sum of the opinions of $i$'s neighbors,
and $d_i := \sum_{j \in N(i)} w_{ij}$ is $i$'s weighted degree. Observe that when $b_i = 0$, (2.3) is identical to
(2.2), i.e., DeGroot's averaging process is a special case of our process and corresponds to unbiased
assimilation. More generally, biased assimilation can be modeled by making $i$'s opinion update
proportional to $\beta_i(x_i(t)) s_i(t)$, where the bias function $\beta_i : [0,1] \to [0,1]$ is non-decreasing.
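A minimal Python sketch of the update (2.3) follows. The name `biased_step`, the data layout, and the two-node example are assumptions chosen for illustration. The sketch also assumes the denominator of (2.3) is nonzero, which holds whenever $w_{ii} > 0$.

```python
def biased_step(w, self_w, b, x):
    """One synchronous update of the biased opinion formation process (2.3).

    w[i] is a dict {j: w_ij}, self_w[i] is w_ii, b[i] is the bias parameter b_i.
    Assumes the denominator is nonzero (true whenever self_w[i] > 0).
    """
    new_x = []
    for i, x_i in enumerate(x):
        s_i = sum(w_ij * x[j] for j, w_ij in w[i].items())
        d_i = sum(w[i].values())
        toward_one = x_i ** b[i] * s_i                # biased weight on support for 1
        toward_zero = (1 - x_i) ** b[i] * (d_i - s_i)  # biased weight on support for 0
        new_x.append((self_w[i] * x_i + toward_one)
                     / (self_w[i] + toward_one + toward_zero))
    return new_x


# Hypothetical two-node graph with unit edge weight and unit self-weights.
w = [{1: 1.0}, {0: 1.0}]
self_w = [1.0, 1.0]
x = [0.8, 0.4]

print(biased_step(w, self_w, [0.0, 0.0], x))  # b = 0: plain DeGroot averaging
print(biased_step(w, self_w, [1.0, 1.0], x))  # b = 1: each node discounts the opposing view
```

With `b = [0.0, 0.0]` both factors `x_i ** b[i]` and `(1 - x_i) ** b[i]` equal 1, so the function returns the DeGroot update (2.2), matching the observation in Definition 2.3.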

Connection with Urn Models. Urn models are an elegant abstraction that has been used
to analyze the properties of a wide variety of probabilistic processes. DeGroot's model of weighted
averaging has the following analogous urn dynamic: $x_i(t)$ denotes the fraction of RED balls in
individual $i$'s urn at time $t$, and $1 - x_i(t)$ denotes the corresponding fraction of BLUE balls. At each
time step, $i$ chooses a neighbor $j$ with probability proportional to $w_{ij}$ and chooses a ball uniformly
at random from $j$'s urn. Individual $i$ adds that ball to his urn and discards a ball chosen uniformly
at random from his urn. When the bias parameter $b_i = 1$, the biased opinion formation process can
be interpreted as the following variant of the above urn dynamic: as before, $i$ chooses a neighbor
$j$ with probability proportional to $w_{ij}$ and chooses a ball uniformly at random from $j$'s urn. In
addition, $i$ also chooses a ball uniformly at random from his own urn. If the colors of the two balls
match, $i$ puts them both into his urn and discards a ball chosen uniformly at random from his urn.
If the colors do not match, the two balls are returned to their respective urns.



2.4 Biased Assimilation by a Single Agent in a Fixed Environment 

Here we demonstrate that our model of biased assimilation mathematically reproduces the empirical 
findings of Lord et al. [LRL79]. We analyze the change in opinion of a single individual as a 
function of his bias parameter when he is exposed to opinions from a fixed environment. The fixed 
environment represents sources of information that influence the individual's opinion, but can be 
assumed to remain unaffected by the individual's opinion, such as the news media, the Internet, the 
organizations that the individual is a part of, etc. 

For this section, we will denote by $x(t) \in [0,1]$ the individual's opinion at time $t$, and by
$b \ge 0$ the individual's bias parameter. Let the individual's weight on his own opinion be $w_{ii} = w$.
Let $s \in (0,1)$ denote the (time-invariant) weighted average of the opinions of all sources in the
individual's environment. Then, from (2.3), the individual's opinion at time $t+1$ is given by

$$x(t+1) = \frac{w x(t) + (x(t))^b s}{w + (x(t))^b s + (1 - x(t))^b (1 - s)} \tag{2.4}$$

Given $s \in (0,1)$ and $b \ne 1$, we define

$$\hat{x}(s, b) := \frac{s^{1/(1-b)}}{s^{1/(1-b)} + (1-s)^{1/(1-b)}} \tag{2.5}$$

as the polarization threshold for the individual. We show that when the individual is sufficiently
biased (i.e., $b > 1$), the polarization threshold $\hat{x}$ is an unstable equilibrium, i.e., in equilibrium the
individual's opinion goes to 1 or 0 depending on whether the initial opinion was greater than or less
than $\hat{x}$. On the other hand, when $b < 1$, $\hat{x}$ is a stable equilibrium.

Theorem 1. Fix $t > 0$. Let $x(t) \in (0,1)$.

1. If $b > 1$,

(a) if $x(t) > \hat{x}$, then $x(t+1) > x(t)$, and $x(t) \to 1$ as $t \to \infty$.

(b) if $x(t) < \hat{x}$, then $x(t+1) < x(t)$, and $x(t) \to 0$ as $t \to \infty$.

(c) if $x(t) = \hat{x}$, then for all $t' > t$, $x(t') = \hat{x}$.

2. If $b < 1$,

(a) if $x(t) > \hat{x}$, then $x(t+1) < x(t)$.

(b) if $x(t) < \hat{x}$, then $x(t+1) > x(t)$.

(c) $x(t) \to \hat{x}$ as $t \to \infty$.
The theorem is proved in Appendix A. The opinion $x(t)$ can be interpreted as the individual's
degree of support for the extreme represented by 1. So, the above theorem shows that when the
individual is sufficiently biased (i.e., $b > 1$), exposure to the environment pushes him away from
the threshold $\hat{x}$ (unless $x(0) = \hat{x}$), and toward one of the extremes, and the individual holds an
extreme opinion ($x(t) = 0$ or $x(t) = 1$) in equilibrium. Thus $\hat{x}$ is an unstable equilibrium. This
mathematically captures the biased assimilation behavior observed empirically. On the other hand,
if the individual has low bias (i.e., $b < 1$), then he gravitates towards the polarization threshold $\hat{x}$
over time. Thus, $\hat{x}$ is a stable equilibrium in this case. The behavior of the individual when $b = 1$
is a limiting case of the two cases proven in the theorem; as $b \to 1$, $\hat{x} \to s$. When the individual is
connected to other individuals in a social network, we will show that the biased opinion formation
process produces polarization even when $b = 1$.
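The two regimes of Theorem 1 can be observed numerically by iterating (2.4) and comparing the outcome with the threshold (2.5). The sketch below is only an illustration: the function names, the choice $s = 0.6$, $w = 1$, the starting opinions, and the number of iterations are all assumptions, and a finite simulation is of course not a substitute for the proof.

```python
def threshold(s, b):
    """Polarization threshold (2.5); requires 0 < s < 1 and b != 1."""
    e = 1.0 / (1.0 - b)
    return s ** e / (s ** e + (1 - s) ** e)


def step(x, s, b, w):
    """One update (2.4) for a single agent with self-weight w and environment average s."""
    toward_one = x ** b * s
    toward_zero = (1 - x) ** b * (1 - s)
    return (w * x + toward_one) / (w + toward_one + toward_zero)


def run(x, s, b, w, steps=2000):
    """Iterate (2.4) for a fixed number of steps and return the final opinion."""
    for _ in range(steps):
        x = step(x, s, b, w)
    return x


s, w = 0.6, 1.0

# High bias (b = 2): starting points on either side of the threshold separate.
print(threshold(s, 2.0))
print(run(0.45, s, 2.0, w), run(0.35, s, 2.0, w))

# Low bias (b = 0.5): different starting points approach the same threshold.
print(threshold(s, 0.5))
print(run(0.1, s, 0.5, w), run(0.9, s, 0.5, w))
```

For $b = 2$ and $s = 0.6$ the threshold evaluates to 0.4, so the two starting opinions 0.45 and 0.35 lie on opposite sides of it.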

3 DeGroot's Repeated Averaging Process is not Polarizing 

It is easy to see that if DeGroot's process were asynchronous, i.e., if individuals updated their opinions
one at a time, each opinion update could only lower the network disagreement index (NDI). However,
here we will show that each opinion update can only lower the NDI even when individuals update
opinions simultaneously. As a result, the repeated averaging process is depolarizing. Our result
holds for arbitrary weights $w_{ij}$ and an arbitrary vector of opinions $x \in [0,1]^n$, i.e., even when the
underlying network is arbitrarily homophilous.

Theorem 2. Consider an arbitrary weighted undirected graph $G = (V, E, w)$. Assume that $G$ is
connected. Let $x(t) \in [0,1]^n$ be an arbitrary vector of opinions of nodes in $G$ at time $t > 0$. Assume
that for all $i \in V$, $b_i = 0$. Then, $\eta(G, x(t+1)) \le \eta(G, x(t))$, i.e., the network disagreement index at
time $t+1$ is no more than that at time $t$.

The theorem is proved in Appendix B. Observe that in the limit as $w_{ii} \to \infty$, individual $i$ can
be viewed as being a zealot [YAO+11], i.e., an individual with an unchanging opinion. So our result
also holds for repeated averaging in the presence of zealots.
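Theorem 2 can be spot-checked numerically on randomly generated instances. The Python sketch below is a sanity check rather than a proof; the random graph generator, the weight ranges, the number of trials, and all function names are assumptions introduced here for illustration.

```python
import random


def ndi(w, x):
    """NDI (2.1). w[i][j] == w[j][i], so every edge is visited twice; hence the 1/2."""
    total = sum(w_ij * (x[i] - x[j]) ** 2
                for i in range(len(x)) for j, w_ij in w[i].items())
    return 0.5 * total


def degroot_step(w, self_w, x):
    """One synchronous DeGroot update (2.2)."""
    return [
        (self_w[i] * x[i] + sum(w_ij * x[j] for j, w_ij in w[i].items()))
        / (self_w[i] + sum(w[i].values()))
        for i in range(len(x))
    ]


def random_instance(n, rng):
    """Random connected weighted graph, self-weights, and opinions (illustrative choices)."""
    w = [dict() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Edges (i, i+1) form a path that keeps the graph connected;
            # every other edge is present with probability 0.3.
            if j == i + 1 or rng.random() < 0.3:
                w[i][j] = w[j][i] = rng.uniform(0.1, 5.0)
    self_w = [rng.uniform(0.0, 2.0) for _ in range(n)]
    x = [rng.random() for _ in range(n)]
    return w, self_w, x


def max_ndi_increase(trials=200, n=8, seed=0):
    """Largest observed NDI(after) - NDI(before) over random instances."""
    rng = random.Random(seed)
    worst = float("-inf")
    for _ in range(trials):
        w, self_w, x = random_instance(n, rng)
        worst = max(worst, ndi(w, degroot_step(w, self_w, x)) - ndi(w, x))
    return worst


print(max_ndi_increase())  # expected to be non-positive up to rounding error
```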

A possible criticism of this result is that it holds for this particular definition of the NDI which 
may not always capture the intuitive notion of polarization. For example, consider a network 
partitioned into two densely connected opposing factions with sparse cross linkages. One might 
consider such a network to be polarized, even though the network disagreement index for it is 
small. An alternate measure that does capture the divergence of opinions in the above example is 
the global disagreement index (GDI) defined below. 

Definition 3.1 (Global Disagreement Index (GDI)). Given a vector of opinions $x \in [0,1]^n$ of
individuals in $V$, the global disagreement index $\gamma(x)$ is defined as

$$\gamma(x) := \sum_{i<j} (x_i - x_j)^2 \tag{3.1}$$

Observe that it is possible to assign edge weights $w_{ij}$ such that DeGroot's repeated averaging
process increases the GDI since the latter is independent of the weights. However, we show that a
variant of repeated averaging, based on the well-known flocking model for decentralized consensus
[Tsi84], can only decrease the GDI. We consider a repeated averaging process where at each time
step $t > 0$, an arbitrary set $S(t) \subseteq V$ of individuals simultaneously updates their opinions to be
closer to the average opinion of the set.
Definition 3.2 (Flocking Process). Let $\epsilon \in [0,1]$. For $t > 0$, let $S(t) \subseteq V$ be an arbitrary set of
individuals. Let $s(t) := \frac{1}{|S(t)|} \sum_{i \in S(t)} x_i(t)$ be the average opinion of individuals in $S(t)$. Under the
flocking process, the opinion of individual $i \in V$ at time $t+1$, $x_i(t+1)$, is given by

$$x_i(t+1) = \begin{cases} (1-\epsilon)\, x_i(t) + \epsilon\, s(t), & \text{if } i \in S(t) \\ x_i(t), & \text{otherwise} \end{cases}$$

Next we show that each opinion update in the flocking process can only lower the GDI.

Theorem 3. Let $x(t) \in [0,1]^n$ be an arbitrary vector of opinions of nodes in $V$ at time $t > 0$. Let
$x(t+1) \in [0,1]^n$ be the vector of opinions at time $t+1$ after one step of the flocking process. Then,
$\gamma(x(t+1)) \le \gamma(x(t))$, i.e., the GDI at time $t+1$ is no more than that at time $t$.

The theorem is proved in Appendix B. 
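The flocking step and the claim of Theorem 3 can be illustrated with a short Python sketch. It assumes that the GDI is the sum of squared pairwise opinion differences over all pairs $i < j$; the function names, the example vector, and the randomized check are likewise assumptions made for illustration, and the subset passed to `flocking_step` must be nonempty.

```python
import random


def gdi(x):
    """Global disagreement index, assumed here to be the sum over i < j of (x_i - x_j)^2."""
    n = len(x)
    return sum((x[i] - x[j]) ** 2 for i in range(n) for j in range(i + 1, n))


def flocking_step(x, subset, eps):
    """Members of the (nonempty) subset move a fraction eps toward the subset average."""
    members = set(subset)
    avg = sum(x[i] for i in members) / len(members)
    return [(1 - eps) * x_i + eps * avg if i in members else x_i
            for i, x_i in enumerate(x)]


def max_gdi_increase(trials=200, n=6, seed=0):
    """Largest observed GDI(after) - GDI(before) over random flocking steps."""
    rng = random.Random(seed)
    worst = float("-inf")
    for _ in range(trials):
        x = [rng.random() for _ in range(n)]
        subset = rng.sample(range(n), rng.randint(1, n))
        worst = max(worst, gdi(flocking_step(x, subset, rng.random())) - gdi(x))
    return worst


x = [0.0, 0.2, 0.9, 1.0]
y = flocking_step(x, [0, 3], 0.5)  # the two most extreme individuals move toward each other
print(y, gdi(x), gdi(y))
print(max_gdi_increase())  # expected to be non-positive up to rounding error
```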



4 Polarization due to Biased Assimilation 

In this section we state and prove our main result: in a simple model of networks with homophily,
the biased opinion formation process may result in either polarization, persistent disagreement, or
consensus depending on how biased the individuals are. We model homophilous networks using a
deterministic variant of multi-type random networks [GJ11]. Multi-type random networks are a
generalization of Erdős-Rényi random graphs. Nodes in $V$ are partitioned into types, say, $\tau_1, \tau_2, \ldots, \tau_k$.
The network is parameterized by a vector $(n_1, \ldots, n_k)$, where $n_i$ is the number of nodes of type $\tau_i$,
and a symmetric matrix $P \in [0,1]^{k \times k}$, where $P_{ij}$ is the probability that there exists an undirected
edge between a node of type $\tau_i$ and another of type $\tau_j$. The class of multi-type random networks
where $P_{ii} > P_{ij}$ for all $i, j$ with $i \ne j$ is called the islands model, and is used to model homophily (since an
individual is more likely to be connected with individuals of the same type). We will analyze the
biased opinion formation process over a deterministic variant of the islands model, which we call a
two-island network.

Definition 4.1. Given integers $n_1, n_2 > 0$, and real numbers $p_s, p_d \in (0,1)$, a $(n_1, n_2, p_s, p_d)$-two
island network is a weighted undirected graph $G = (V_1, V_2, E, w)$ where

• $|V_1| = n_1$, $|V_2| = n_2$ and $V_1 \cap V_2 = \emptyset$.

• Each node $i \in V_1$ has $n_1 p_s$ neighbors in $V_1$ and $n_2 p_d$ neighbors in $V_2$.

• Each node $i \in V_2$ has $n_2 p_s$ neighbors in $V_2$ and $n_1 p_d$ neighbors in $V_1$.¹

• $p_s > p_d$.

For a two-island network, we define the degree of homophily as follows. 

Definition 4.2. Let $G = (V_1, V_2, E, w)$ be a $(n_1, n_2, p_s, p_d)$-two island network. Then the degree of
homophily in $G$, $h_G$, is defined to be the ratio $p_s / p_d$.

Informally, a high value of $h_G$ implies that nodes in $V$ are much more likely to form edges to
other nodes of their own type, thereby exhibiting a high degree of homophily.

¹ For clarity of exposition, we assume that the quantities $n_1 p_s$, $n_2 p_s$, $n_1 p_d$ and $n_2 p_d$ are all integers.

Algorithm 1 SimpleSALSA

Input: $G = (V_1, V_2, E)$, node $i \in V_1$.
Perform a three-step random walk on $G$ starting at $i$.
Let the random walk end at node $j \in V_2$.
Output: $j$.

Theorem 4. Let $G = (V_1, V_2, E, w)$ be a $(n, n, p_s, p_d)$-two island network. For all $i \in V = V_1 \cup V_2$,
let $w_{ii} = 0$. For all $(i,j) \in E$, let $w_{ij} = 1$. Assume for all $i \in V_1$, $x_i(0) = x_0$ where $\frac{1}{2} < x_0 < 1$.
Assume for all $i \in V_2$, $x_i(0) = 1 - x_0$. Assume for all $i \in V$, the bias parameter $b_i = b > 0$. Then,

1. (Polarization) If $b \ge 1$, then for all $i \in V_1$, $\lim_{t \to \infty} x_i(t) = 1$, and for all $i \in V_2$, $\lim_{t \to \infty} x_i(t) = 0$.

2. (Persistent Disagreement) If $1 > b > \frac{2}{h_G + 1}$, then there exists a unique $\hat{x} \in (\frac{1}{2}, 1)$ such that
for all $i \in V_1$, $\lim_{t \to \infty} x_i(t) = \hat{x}$, and for all $i \in V_2$, $\lim_{t \to \infty} x_i(t) = 1 - \hat{x}$.

3. (Consensus) If $b < \frac{2}{h_G + 1}$, then for all $i \in V$, $\lim_{t \to \infty} x_i(t) = \frac{1}{2}$.

The theorem is proved in Appendix C. Let us analyze the implications of this theorem. Let
$\eta(G, x(t)) \to \eta_\infty$ as $t \to \infty$, i.e., let $\eta_\infty$ be the NDI at equilibrium. Then, the above result implies
that when $b \ge 1$, $\eta_\infty > \eta(G, x(0))$, i.e., the biased opinion formation process is polarizing. On the
other hand, when individuals are moderately biased (i.e., $1 > b > 2/(h_G + 1)$), $\eta_\infty > \eta(G, x(0))$
if and only if $x_0 < \hat{x}$; so the opinion formation process may not be polarizing, but it doesn't
produce consensus either. Finally, when individuals have low bias (i.e., $b < 2/(h_G + 1)$), $\eta_\infty = 0 <
\eta(G, x(0))$, i.e., the opinion formation process is depolarizing, since the network reaches consensus
in equilibrium.

This illustrates the importance of the bias parameter in causing polarization. Also, observe that
$b = 1$ corresponds to the urn dynamic described in Section 2.3, and hence the above result shows
that this urn dynamic causes polarization for an arbitrarily small degree of homophily.
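The three regimes of Theorem 4 can be explored with a small simulation. The sketch below relies on a simplification that is not spelled out in the text: under the theorem's symmetric initial conditions, every node in $V_1$ holds a common opinion $x(t)$ and every node in $V_2$ holds $1 - x(t)$, so (2.3) with $w_{ii} = 0$ reduces to a scalar recursion in $x$ that depends on the network only through $h = p_s / p_d$. The function names, the choice $h = 3$ (so that $2/(h+1) = 0.5$), the bias values, and the iteration count are illustrative assumptions.

```python
def island_step(x, b, h):
    """Scalar form of (2.3) for a node in V_1 of a symmetric two-island network.

    x is the common opinion in V_1 (V_2 holds 1 - x) and h = p_s / p_d.
    The common factor n * p_d cancels between numerator and denominator.
    """
    toward_one = x ** b * (h * x + (1 - x))          # x^b * s, up to the common factor
    toward_zero = (1 - x) ** b * (h * (1 - x) + x)   # (1 - x)^b * (d - s)
    return toward_one / (toward_one + toward_zero)


def limit_opinion(x0, b, h, steps=5000):
    """Iterate the scalar recursion and return the final opinion in V_1."""
    x = x0
    for _ in range(steps):
        x = island_step(x, b, h)
    return x


h = 3.0   # degree of homophily, so 2 / (h + 1) = 0.5
x0 = 0.6  # initial opinion in V_1; V_2 starts at 0.4

for b in (1.0, 0.75, 0.25):
    print(b, limit_opinion(x0, b, h))
```

With these hypothetical values, `b = 1.0` falls in the polarization regime, `b = 0.75` in the persistent disagreement regime, and `b = 0.25` in the consensus regime. Bias values exactly at a regime boundary are deliberately avoided here.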

5 Recommender Systems and Polarization 

Recommender systems are widely used on the Internet to present personalized information (e.g.,
search results, new articles, products) to individuals. This personalization is typically done by
algorithms that use an individual's past behavior (e.g., history of browsing and purchases), and that of
other individuals who are similar in some way to that individual, to discover items of possible interest
to the user. It has been argued [Sun02] that this personalization of information has an echo-chamber
effect where individuals are only exposed to information they agree with, and this ultimately leads
to increased polarization. In this section we investigate this question: do recommender systems
have a polarizing effect? We analyze three simple random-walk based recommender algorithms,
SimpleSALSA (Algorithm 1), SimplePPR (Algorithm 2) and SimpleICF (Algorithm 3), that are
similar in spirit to three well-known recommender algorithms from the literature: SALSA [LM01],
Personalized PageRank [PBMW99], and item-based collaborative filtering [LSY03], respectively.

We consider the following simple model: Let $G = (V_1, V_2, E)$ be an unweighted undirected
bipartite graph. Nodes in $V_1$ represent individuals. Nodes in $V_2$ represent items. The items could
be books, webpages, news articles, products, etc. For concreteness, we will refer to nodes in $V_2$ as
books. For a node $i \in V_1$ and a node $j \in V_2$, an edge $(i,j) \in E$ represents ownership, i.e., individual
$i$ owns book $j$. For our purpose, we define a recommender algorithm as below.

Definition 5.1. A recommender algorithm takes as input a bipartite graph $G = (V_1, V_2, E)$ and a
node $i \in V_1$, and outputs a node $j \in V_2$.

Thus, given a graph representing which users own which books, and a specific user $i$, a recommender
algorithm outputs a single book $j$ to be recommended to $i$. We assume that $i$ can only buy
a book if it is recommended to him. However, he may choose to reject a recommendation, i.e., to
not buy a recommended book. Therefore, $i$ buying a book $j$ requires two steps: the recommender
algorithm must recommend $j$ to $i$, and then $i$ must accept the recommendation.

Algorithm 2 SimplePPR

Input: $G = (V_1, V_2, E)$, node $i \in V_1$.
Parameter: A large positive integer $T$.
Perform $T$ three-step random walks on $G$ starting at node $i$.
For node $j \in V_2$, let count($j$) be the number of random walks that end at node $j$.
Output: $j^* := \arg\max_j$ count($j$).
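SimpleSALSA and SimplePPR both rest on three-step random walks from a user, and can be sketched in Python as follows. The sketch assumes that each step of the walk moves to a neighbor chosen uniformly at random, which the algorithm listings do not state explicitly. The dictionary-based graph representation, the function names, the tiny example graph, and the arbitrary tie-breaking in `simple_ppr` are likewise assumptions made for illustration.

```python
import random
from collections import Counter


def three_step_walk(user_books, book_users, i, rng):
    """Walk user -> book -> user -> book, choosing a uniform random neighbor each step."""
    book = rng.choice(user_books[i])     # step 1: a book owned by user i
    user = rng.choice(book_users[book])  # step 2: an owner of that book
    return rng.choice(user_books[user])  # step 3: a book owned by that owner


def simple_salsa(user_books, book_users, i, rng):
    """Recommend the endpoint of a single three-step walk."""
    return three_step_walk(user_books, book_users, i, rng)


def simple_ppr(user_books, book_users, i, rng, T=2000):
    """Recommend the most frequent endpoint among T three-step walks."""
    counts = Counter(three_step_walk(user_books, book_users, i, rng) for _ in range(T))
    return counts.most_common(1)[0][0]  # ties are broken arbitrarily


# Hypothetical graph: user 0 owns book "a"; user 1 owns books "a" and "b".
user_books = {0: ["a"], 1: ["a", "b"]}
book_users = {"a": [0, 1], "b": [1]}

rng = random.Random(0)
print(simple_salsa(user_books, book_users, 0, rng))
print(simple_ppr(user_books, book_users, 0, rng))
```

The design difference is visible in the code: `simple_salsa` returns a random sample from the distribution of walk endpoints, whereas `simple_ppr` returns its mode.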

Since we are interested in analyzing the polarizing effects of recommender systems, we will
assume that each book in $V_2$ is labeled either RED or BLUE. These labels are purely for the purpose
of analysis; the algorithms we study are agnostic to these labels. For each individual $i \in V_1$, let
$x_i \in [0,1]$ be the fraction of RED books owned by $i$, and $1 - x_i$ be that of BLUE books. Individuals
may be biased, or unbiased, as we define below.

Definition 5.2. Consider a book recommended to an individual $i \in V_1$. We say that $i$ is unbiased
if $i$ accepts the recommendation with the same probability independent of whether the book is RED or
BLUE. We say that $i$ is biased if

1. $i$ accepts the recommendation of a RED book with probability $x_i$, and rejects it with probability
$1 - x_i$, and

2. $i$ accepts the recommendation of a BLUE book with probability $1 - x_i$, and rejects it with
probability $x_i$.

Observe that the above definition of an individual $i$ being biased corresponds to the urn dynamic
described in Section 2.3 with $b_i = 1$. For an individual $i$, the fraction of RED books $i$ owns, $x_i$, can
be viewed as $i$'s opinion in the interval $[0,1]$, and so a recommender algorithm can be viewed as
an opinion formation process. The opinion $x_i$ remains unchanged if $i$ rejects a recommendation.
However, if $i$ accepts a recommendation, $x_i$ increases or decreases depending on whether the
recommended book was RED or BLUE. Thus, we are interested in the probability that a recommendation
was for a RED (or BLUE) book given that $i$ accepted the recommendation. The above probability
determines whether a recommender algorithm is polarizing or not.

Definition 5.3. Consider a recommender algorithm and an individual $i \in V_1$ that accepts the
algorithm's recommendation. The algorithm is polarizing with respect to $i$ if

1. when $x_i > \frac{1}{2}$, the probability that the recommended book was RED is greater than $x_i$, and

2. when $x_i < \frac{1}{2}$, the probability that the recommended book was RED is less than $x_i$.

In order to analyze the recommender algorithms, we assume a generative model for G, which 
we describe next. 

5.1 Generative Model for G 

Let the number of individuals be $|V_1| = m > 0$. Let the number of books be $|V_2| = 2n$, with $n > 0$ books
of each color. We assume that $m = f(n)$ and $\lim_{n \to \infty} f(n) = \infty$. For each individual $i \in V_1$, we
draw $x_i$ independently from a distribution over $[0, 1]$ with a probability density function (pdf) $g(\cdot)$.
Algorithm 3 SimpleICF

Input: $G = (V_1, V_2, E)$, node $i \in V_1$.
Parameter: A large positive integer $T$.
Choose a neighbor $k$ of $i$ uniformly at random.
Perform $T$ two-step random walks on $G$ starting at $k$.

For node $j \in V_2$, let count(j) be the number of random walks that end at node $j$.
Output: $j^* := \arg\max_j$ count(j).
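
A minimal Python sketch of Algorithm 3 may help fix ideas. It assumes the bipartite graph is given as two adjacency dictionaries, `books_of` (individual to owned books) and `owners_of` (book to its owners); these names, the function name `simple_icf`, the toy graph, and the arbitrary tie-breaking in the arg max are illustrative assumptions, not part of the algorithm's specification.

```python
import random
from collections import Counter

def simple_icf(books_of, owners_of, i, T, rng=random):
    """Sketch of SimpleICF: return the book reached most often by
    T two-step random walks started at a random book owned by i."""
    k = rng.choice(books_of[i])            # neighbor k of i, uniformly at random
    count = Counter()
    for _ in range(T):
        owner = rng.choice(owners_of[k])   # step 1: book k -> one of its owners
        j = rng.choice(books_of[owner])    # step 2: owner -> one of their books
        count[j] += 1                      # this walk ends at book j
    return max(count, key=count.get)       # j* := argmax_j count(j)

# Toy graph (hypothetical data): individuals 0 and 1, books "a" and "b".
books_of = {0: ["a"], 1: ["a", "b"]}
owners_of = {"a": [0, 1], "b": [1]}
recommended = simple_icf(books_of, owners_of, i=0, T=1000, rng=random.Random(0))
```

Note that this sketch does not exclude books that $i$ already owns, so a walk may end back at $k$; this section does not say whether the algorithm filters such books.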



We assume that $g$ is symmetric about $\frac{1}{2}$, i.e., for all $y \in [0, 1]$, $g(y) = g(1 - y)$. This implies that
for all $i \in V_1$, $E[x_i] = \frac{1}{2}$. We assume that the variance of the distribution is strictly positive, i.e.,
$\mathrm{Var}(x_1) > 0$. For an individual $i$ and a RED book $j$, there exists an edge $(i,j) \in E$ independently
with probability $\frac{x_i k}{n}$, where $0 < k < n$. For an individual $i$ and a BLUE book $j$, there exists an
edge $(i,j) \in E$ independently with probability $\frac{(1 - x_i) k}{n}$. So, in expectation, each individual $i$ owns
$k$ books, and an $x_i$ fraction of them are RED.
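
The generative model can be sketched in a few lines of Python. The sketch assumes edge probabilities $x_i k/n$ for RED books and $(1 - x_i)k/n$ for BLUE books, which is what the stated expectations (each individual owns $k$ books, an $x_i$ fraction of them RED) require. The function name `sample_graph`, the book indexing, and the uniform choice of $g$ in the usage example are illustrative assumptions.

```python
import random

def sample_graph(m, n, k, draw_x, rng):
    """Sample opinions x_i and the edge set of the bipartite graph.
    Books 0..n-1 are RED and books n..2n-1 are BLUE (a labeling convention
    chosen for this sketch)."""
    x = [draw_x(rng) for _ in range(m)]
    edges = set()
    for i in range(m):
        p_red = x[i] * k / n            # assumed edge probability to a RED book
        p_blue = (1 - x[i]) * k / n     # assumed edge probability to a BLUE book
        for j in range(2 * n):
            p = p_red if j < n else p_blue
            if rng.random() < p:
                edges.add((i, j))
    return x, edges

# Usage: g uniform on [0, 1], which is symmetric about 1/2 with variance 1/12.
rng = random.Random(0)
x, edges = sample_graph(m=50, n=200, k=20, draw_x=lambda r: r.random(), rng=rng)
avg_degree = len(edges) / 50            # should be close to k = 20
```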

For two books $j, j' \in V_2$, let $M_{jj'} := |N(j) \cap N(j')|$ be the number of individuals in $V_1$ that are
neighbors of both $j$ and $j'$ in $G$. For any two nodes $i, j \in V$, let $P[i \xrightarrow{\ell} j]$ be the probability that an
$\ell$-step random walk over $G$ starting at $i$ ends at $j$. For a node $i \in V_1$ and a node $j \in V_2$, let $Z_{ij}$ be
the indicator variable for edge $(i,j)$, i.e., $Z_{ij} = 1$ if $(i,j) \in E$, and $Z_{ij} = 0$ otherwise.

5.2 Analysis 

Next we prove our results about the polarizing effects of each of the three algorithms. Our results
hold with probability 1 in the limit as $n \to \infty$. First we invoke the Strong Law of Large Numbers
to show that the random quantities we care about all take their expected values with probability 1
as $n \to \infty$.

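Lemma 5.1. In the limit as $n \to \infty$, with probability 1,

(a) for all $i \in V_1$, $|N(i)| \to k$,

(b) for all $i \in V_1$, $\sum_{j_1 \in V_2:\, j_1 \text{ is RED}} Z_{ij_1} \to x_i k$,

(c) for all $i \in V_1$, $\sum_{j_2 \in V_2:\, j_2 \text{ is BLUE}} Z_{ij_2} \to (1 - x_i) k$,

(d) for all $j \in V_2$, $|N(j)| \to \frac{mk}{2n}$,

(e) for every pair of RED books $j, j' \in V_2$, $M_{jj'} = \sum_{i \in V_1} Z_{ij} Z_{ij'} \to \frac{mk^2}{n^2}\left(\frac{1}{4} + \mathrm{Var}(x_1)\right)$,

(f) for every pair of BLUE books $j, j' \in V_2$, $M_{jj'} = \sum_{i \in V_1} Z_{ij} Z_{ij'} \to \frac{mk^2}{n^2}\left(\frac{1}{4} + \mathrm{Var}(x_1)\right)$, and

(g) for every RED book $j$ and every BLUE book $j'$, $M_{jj'} = \sum_{i \in V_1} Z_{ij} Z_{ij'} \to \frac{mk^2}{n^2}\left(\frac{1}{4} - \mathrm{Var}(x_1)\right)$.

Proof. Recall that as $n \to \infty$, $m = f(n) \to \infty$. So statements (a) through (g) follow from the
Strong Law of Large Numbers. □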

We use Lemma 5.1 to prove our results. First we show that SimplePPR (Algorithm 2) is
polarizing with respect to $i$ even if $i$ is unbiased.

Theorem 5. In the limit as $n \to \infty$ and as $T \to \infty$, SimplePPR is polarizing with respect to $i$.

Next we show that SimpleSALSA and SimpleICF are polarizing only if $i$ is biased.

Theorem 6. In the limit as $n \to \infty$,

1. SimpleSALSA is polarizing with respect to $i$ if and only if $i$ is biased.

2. In the limit as $T \to \infty$, SimpleICF is polarizing with respect to $i$ if and only if $i$ is biased.

Both Theorem 5 and Theorem 6 are proved in Appendix D.

6 Discussion of Various Measures of Opinion Divergence 

Recall that we define an opinion formation process to be polarizing if it results in an increased
divergence of opinions. Here we describe a number of alternate measures of divergence, and discuss
how many of our results hold for these measures. A generalization of the global disagreement index
(GDI) is the following: $\sum_{i<j} h(|x_i - x_j|)$, where $h$ is an arbitrary convex function. The flocking
process has the property that the vector $\mathbf{x}(t + 1)$ is majorized by $\mathbf{x}(t)$. Therefore, as noted in the
proof of Theorem 3, each opinion update of the flocking process is depolarizing under this definition,
or more generally, when divergence is defined by any symmetric convex function of $\mathbf{x}$.

A stronger definition of divergence is one based on second order stochastic dominance, which is 
defined over distributions, but can be easily modified to work with vectors. Informally, a distribution 
F is second order stochastically dominated by a distribution G if F is a mean-preserving spread of G. 
Let us say an opinion formation process is polarizing if the final opinion vector is dominated (second 
order stochastically) by the initial opinion vector, and is depolarizing if the final vector dominates 
the initial vector. According to this definition, a single opinion update in the DeGroot and flocking 
processes is in general neither polarizing nor depolarizing. However, both these processes have been 
shown to converge to consensus under fairly general conditions ([DeG74, Tsi84]). Thus, under those 
conditions, both these processes are depolarizing in equilibrium. Moreover, our results on the three 
recommender algorithms (Theorem 5 and Theorem 6) also hold under this definition of divergence. 

Consider the following even stronger definition of polarization: a process is polarizing if at each 
time step, it pushes the opinions of individuals away from the average and is depolarizing if it brings 
their opinions closer to the average. Under this definition too, the DeGroot and flocking processes 
are neither polarizing nor depolarizing. However, under all three definitions, the biased opinion 
formation process is polarizing on a two-island network when b > 1. 

7 Conclusion 

In this paper we attempted to explain polarization in society through a model of opinion formation. 
We generalized DeGroot's repeated averaging model to account for biased assimilation. We showed 
that DeGroot-like repeated averaging processes can never be polarizing, even if individuals are 
arbitrarily homophilous. We also showed that in a two-island network, our biased opinion formation 
process may result in either polarization (if $b > 1$), persistent disagreement (if $1 > b > 2/(h_G + 1)$),
or consensus (if $b < 2/(h_G + 1)$). In other words, homophily alone, without biased assimilation, is
not sufficient to polarize society. We used biased assimilation to provide insight into the polarizing
effects of three recommender algorithms: SimpleSALSA, SimplePPR and SimpleICF. We showed
that for a simple, natural model of the underlying user-item graph, SimpleSALSA and SimpleICF
are polarizing only if individuals are biased whereas SimplePPR is polarizing even if individuals are
unbiased.
One direction for further investigation is to study, through human subject experiments, how the
degree of homophily and the strength of biased assimilation affect whether individuals interacting
over a network polarize or arrive at a consensus. Our analysis of recommender algorithms is
a first step toward designing algorithms and online social systems that counteract polarization
and facilitate greater consensus between individuals over complex and vexing social, economic and
political issues. We view this as a promising and important direction for further research.
A Proof of Theorem 1

Recall that

$$x(t+1) = \frac{w\,x(t) + (x(t))^b s}{w + (x(t))^b s + (1 - x(t))^b (1 - s)}.$$

Equivalently,

$$\frac{x(t+1)}{1 - x(t+1)} = \frac{w\,x(t) + (x(t))^b s}{w(1 - x(t)) + (1 - x(t))^b (1 - s)} = \frac{w + (x(t))^{b-1} s}{w + (1 - x(t))^{b-1}(1 - s)} \cdot \frac{x(t)}{1 - x(t)}. \tag{A.1}$$

First we will show that if $x(t) = \hat{x}$, then for all $t' > t$, $x(t') = \hat{x}$.
Lemma A.1. Assume $b \neq 1$. Fix $t > 0$. Let $x(t) = \hat{x}$. Then for all $t' > t$, $x(t') = \hat{x}$.

Proof. To prove the lemma, it suffices to show that $x(t + 1) = x(t) = \hat{x}$. Recall that

$$\hat{x} := \frac{s^{1/(1-b)}}{s^{1/(1-b)} + (1 - s)^{1/(1-b)}}.$$

Or equivalently,

$$\left(\frac{\hat{x}}{1 - \hat{x}}\right)^{1-b} = \frac{s}{1 - s}.$$

This implies that when $x(t) = \hat{x}$, $(x(t))^{b-1} s = (1 - x(t))^{b-1}(1 - s)$. Substituting this in (A.1), we get
that

$$\frac{x(t+1)}{1 - x(t+1)} = \frac{x(t)}{1 - x(t)}.$$

Or equivalently, $x(t + 1) = x(t)$. □

Next we will show that when $b > 1$, $\hat{x}$ is an unstable equilibrium.

Lemma A.2. Let $b > 1$. Fix $t > 0$.

1. If $x(t) > \hat{x}$, then $x(t + 1) > x(t)$.

2. If $x(t) < \hat{x}$, then $x(t + 1) < x(t)$.

Proof. Again, recall that

$$\left(\frac{\hat{x}}{1 - \hat{x}}\right)^{1-b} = \frac{s}{1 - s}.$$

Therefore, if $x(t) > \hat{x}$, it implies that

$$\frac{x(t)}{1 - x(t)} > \frac{\hat{x}}{1 - \hat{x}} \implies \left(\frac{x(t)}{1 - x(t)}\right)^{1-b} < \left(\frac{\hat{x}}{1 - \hat{x}}\right)^{1-b} = \frac{s}{1 - s} \quad (\text{since } b > 1).$$

Or equivalently, $(x(t))^{b-1} s > (1 - x(t))^{b-1}(1 - s)$. Substituting this in (A.1), we get that

$$\frac{x(t+1)}{1 - x(t+1)} > \frac{x(t)}{1 - x(t)}.$$

Or equivalently, $x(t + 1) > x(t)$.

By a similar argument, if $x(t) < \hat{x}$, then $(x(t))^{b-1} s < (1 - x(t))^{b-1}(1 - s)$. Again, substituting
this in (A.1), we get that

$$\frac{x(t+1)}{1 - x(t+1)} < \frac{x(t)}{1 - x(t)}.$$

Or equivalently, $x(t + 1) < x(t)$. □

Next we will show that when $b > 1$, either $\lim_{t \to \infty} x(t) = 1$ or $\lim_{t \to \infty} x(t) = 0$.

Lemma A.3. Let $b > 1$. Fix $t > 0$.

1. If $x(t) > \hat{x}$, then $\lim_{t \to \infty} x(t) = 1$.

2. If $x(t) < \hat{x}$, then $\lim_{t \to \infty} x(t) = 0$.

Proof. For the proof, we will assume that $x(t) > \hat{x}$ and show that $\lim_{t \to \infty} x(t) = 1$. The case when
$x(t) < \hat{x}$ can be argued in an analogous way.

By definition, we know that for all $t > 0$, $x(t) \in [0,1]$. Further, from Lemma A.2, we know
that the sequence $\{x(t')\}_{t' \geq t}$ is strictly increasing. Since the sequence is strictly increasing and
bounded, it must converge either to 1 or to some value in the interval $[x(t), 1)$. Consider the
function $g : [0, 1] \to \mathbb{R}$ defined as

$$g(y) := \frac{wy + y^b s}{w + y^b s + (1 - y)^b (1 - s)} - y.$$

Observe that for all $t > 0$, $x(t + 1) - x(t) = g(x(t))$. Therefore,

(a) for all $y \in [x(t), 1)$, $g(y) > 0$ (since, by Lemma A.2, the sequence $\{x(t')\}_{t' \geq t}$ is strictly
increasing), and

(b) $g(1) = 0$.

For the purpose of contradiction, assume that $\lim_{t \to \infty} x(t) = a$, where $x(t) < a < 1$. This implies that
for every $\epsilon > 0$, there exists a $t(\epsilon)$ such that for all $t' > t(\epsilon)$, $x(t' + 1) - x(t') < \epsilon$, or equivalently,
that for all $t' > t(\epsilon)$, $g(x(t')) < \epsilon$.

Let $\min_{y \in [x(t), a]} g(y) = c$. It implies that for all $y \in [x(t), a]$, $g(y) \geq c$. From (a), it follows that $c > 0$.
Setting $\epsilon = c$, our analysis implies the following two properties of $g$: (1) for all $t' \geq t$, $g(x(t')) \geq c$,
and (2) for all $t' > t(\epsilon)$, $g(x(t')) < c$, which contradict each other. This completes the proof by
contradiction. □

Using a similar argument we can show that when $b < 1$, $\hat{x}$ is a stable equilibrium.

Lemma A.4. Let $b < 1$. Fix $t > 0$.

1. If $x(t) > \hat{x}$, then $x(t + 1) < x(t)$.

2. If $x(t) < \hat{x}$, then $x(t + 1) > x(t)$.

Lemma A.5. Let $b < 1$. Then, $\lim_{t \to \infty} x(t) = \hat{x}$.
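
The behavior described by Lemmas A.1 through A.5 can be checked numerically. The Python sketch below iterates the update recalled at the start of this appendix and compares the outcome with the closed-form equilibrium; the function names and the parameter values ($w = 1$, $s = 0.6$, $b \in \{2, 0.5\}$) are illustrative assumptions.

```python
def step(x, w, s, b):
    """One update x(t) -> x(t+1) of the dynamic recalled above."""
    num = w * x + (x ** b) * s
    den = w + (x ** b) * s + ((1 - x) ** b) * (1 - s)
    return num / den

def fixed_point(s, b):
    """Closed-form equilibrium x_hat; requires b != 1."""
    e = 1.0 / (1.0 - b)
    return s ** e / (s ** e + (1 - s) ** e)

def iterate(x0, w, s, b, steps=2000):
    x = x0
    for _ in range(steps):
        x = step(x, w, s, b)
    return x

w, s = 1.0, 0.6
x_hat_unstable = fixed_point(s, b=2.0)              # b > 1: unstable equilibrium
x_hat_stable = fixed_point(s, b=0.5)                # b < 1: stable equilibrium
high = iterate(x_hat_unstable + 0.05, w, s, b=2.0)  # starts above x_hat, drifts up
low = iterate(x_hat_unstable - 0.05, w, s, b=2.0)   # starts below x_hat, drifts down
mid = iterate(0.1, w, s, b=0.5)                     # b < 1: pulled toward x_hat
```

With $b = 2$ the two runs separate toward 1 and 0, while with $b = 0.5$ the run settles at the interior equilibrium, in line with Lemmas A.3 and A.5.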

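B Proofs of Section 3

Proof of Theorem 2. Recall that since $b_i = 0$, the opinion of node $i$ at time $t + 1$ is given by

$$x_i(t+1) = \frac{w_{ii} x_i(t) + \sum_{j \in N(i)} w_{ij} x_j(t)}{w_{ii} + d_i} \tag{B.1}$$

where recall that $d_i := \sum_{j \in N(i)} w_{ij}$ is the weighted degree of node $i$. Let $L_G$ be the weighted
Laplacian matrix of $G$. Recall that $L_G$ is given by

$$(L_G)_{ij} = \begin{cases} d_i, & \text{if } i = j \\ -w_{ij}, & \text{if } (i,j) \in E \\ 0, & \text{otherwise} \end{cases}$$

Now consider the vector $L_G \mathbf{x}(t)$. The $i$th entry of the vector is given by

$$(L_G \mathbf{x}(t))_i = d_i x_i(t) - \sum_{j \in N(i)} w_{ij} x_j(t) = d_i x_i(t) + w_{ii} x_i(t) - \Big( w_{ii} x_i(t) + \sum_{j \in N(i)} w_{ij} x_j(t) \Big) = (d_i + w_{ii})(x_i(t) - x_i(t+1)) \quad (\text{from (B.1)})$$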
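Equivalently, in matrix notation,

$$\mathbf{x}(t+1) = (I - D L_G)\,\mathbf{x}(t) \tag{B.2}$$

where $D$ is a diagonal matrix such that $D_{ii} = 1/(d_i + w_{ii})$. Note that since $G$ is connected, $d_i > 0$,
and therefore $D_{ii}$ is finite. Consider the difference $\eta(G, \mathbf{x}(t+1)) - \eta(G, \mathbf{x}(t))$. Observe that for a
vector $\mathbf{y} \in [0,1]^n$, $\eta(G, \mathbf{y}) = \mathbf{y}^T L_G \mathbf{y}$. Therefore, we have that

$$
\begin{aligned}
\eta(G, \mathbf{x}(t+1)) - \eta(G, \mathbf{x}(t))
&= (\mathbf{x}(t+1))^T L_G\, \mathbf{x}(t+1) - (\mathbf{x}(t))^T L_G\, \mathbf{x}(t) \\
&= (\mathbf{x}(t))^T (I - D L_G)^T L_G (I - D L_G)\, \mathbf{x}(t) - (\mathbf{x}(t))^T L_G\, \mathbf{x}(t) && (\text{from (B.2)}) \\
&= (\mathbf{x}(t))^T \big( (L_G - L_G D L_G)(I - D L_G) - L_G \big)\, \mathbf{x}(t) && (\text{since } L_G \text{ is symmetric}) \\
&= (\mathbf{x}(t))^T \big( L_G - L_G D L_G - L_G D L_G + L_G D L_G D L_G - L_G \big)\, \mathbf{x}(t) \\
&= (\mathbf{x}(t))^T \big( L_G D L_G D L_G - 2 L_G D L_G \big)\, \mathbf{x}(t) \\
&= (\mathbf{x}(t))^T L_G D^{1/2} \big( D^{1/2} L_G D^{1/2} - 2I \big) D^{1/2} L_G\, \mathbf{x}(t) && (\text{since } L_G \text{ is symmetric}) \\
&= \mathbf{y}^T \big( D^{1/2} L_G D^{1/2} - 2I \big)\, \mathbf{y} && (\text{where } \mathbf{y} := D^{1/2} L_G\, \mathbf{x}(t))
\end{aligned}
$$

Thus, in order to show that $\eta(G, \mathbf{x}(t+1)) - \eta(G, \mathbf{x}(t)) \leq 0$, it suffices to show that for all vectors
$\mathbf{y} \in \mathbb{R}^n$, $\mathbf{y}^T D^{1/2} L_G D^{1/2} \mathbf{y} \leq 2\|\mathbf{y}\|_2^2$. We prove this as Lemma B.1. □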
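
As a numerical sanity check of Theorem 2, the Python sketch below applies the update (B.1) repeatedly on a small random weighted graph and records $\eta(G, \mathbf{x}) = \sum_{(i,j) \in E} w_{ij}(x_i - x_j)^2$ after each step. The graph (a complete graph on 6 nodes with random symmetric weights and random self-weights), the seed, and the function names are illustrative assumptions.

```python
import random

def degroot_step(x, w):
    """Update (B.1): weighted average over self (w[i][i]) and neighbors."""
    n = len(x)
    return [sum(w[i][j] * x[j] for j in range(n)) / sum(w[i]) for i in range(n)]

def ndi(x, w):
    """eta(G, x): sum over edges of w_ij * (x_i - x_j)^2, each edge counted once."""
    n = len(x)
    return sum(w[i][j] * (x[i] - x[j]) ** 2
               for i in range(n) for j in range(i + 1, n))

rng = random.Random(1)
n = 6
w = [[0.0] * n for _ in range(n)]
for i in range(n):
    w[i][i] = rng.random()                  # self-weight w_ii
    for j in range(i + 1, n):
        w[i][j] = w[j][i] = rng.random()    # symmetric edge weight w_ij
x = [rng.random() for _ in range(n)]

history = [ndi(x, w)]
for _ in range(50):
    x = degroot_step(x, w)
    history.append(ndi(x, w))
```

The recorded values in `history` should never increase from one step to the next, which is what the theorem asserts.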

Lemma B.1. Consider an arbitrary weighted undirected graph $G = (V, E, w)$ over $n$ nodes. Let
$L_G$ be the weighted Laplacian matrix of $G$. Let $D$ be an $n \times n$ diagonal matrix such that for $i =
1, \ldots, n$, $D_{ii} = 1/(d_i + w_{ii})$, where $d_i = \sum_{j \in N(i)} w_{ij}$ is the weighted degree of $i$ in $G$. Let $\mathbf{y} \in \mathbb{R}^n$ be
an arbitrary vector. Then, $\mathbf{y}^T D^{1/2} L_G D^{1/2} \mathbf{y} \leq 2\|\mathbf{y}\|_2^2$.
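Proof. For $i = 1, \ldots, n$, let $r_i := d_i + w_{ii}$. Let $P := D^{1/2} L_G D^{1/2}$. Then,

$$P_{ij} = \begin{cases} \dfrac{d_i}{r_i}, & i = j \\[4pt] -\dfrac{w_{ij}}{\sqrt{r_i r_j}}, & (i,j) \in E \\[4pt] 0, & \text{otherwise} \end{cases}$$

Then, we have that

$$
\begin{aligned}
\mathbf{y}^T P\, \mathbf{y} &= \sum_{i=1}^{n} P_{ii}\, y_i^2 + 2 \sum_{(i,j) \in E} P_{ij}\, y_i y_j \\
&= \sum_{i=1}^{n} \frac{d_i}{r_i}\, y_i^2 - 2 \sum_{(i,j) \in E} \frac{w_{ij}}{\sqrt{r_i r_j}}\, y_i y_j \\
&\leq \sum_{i=1}^{n} \frac{d_i}{r_i}\, y_i^2 + \sum_{(i,j) \in E} w_{ij} \left( \frac{y_i^2}{r_i} + \frac{y_j^2}{r_j} \right) \\
&= 2 \sum_{i=1}^{n} \frac{d_i}{r_i}\, y_i^2 \\
&\leq 2 \sum_{i=1}^{n} y_i^2 \quad (\text{since } d_i \leq r_i) \\
&= 2\|\mathbf{y}\|_2^2
\end{aligned}
$$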
□

Proof of Theorem 3. Let $|S(t)| = k$. Then, the opinion update (3.2) under the flocking process can
be written in matrix form as

$$\mathbf{x}(t+1) = (1 - \epsilon)\,\mathbf{x}(t) + \epsilon A(t)\,\mathbf{x}(t)$$

where $A(t)$ is an $n \times n$ matrix given by

$$A_{ij}(t) = \begin{cases} \dfrac{1}{k}, & \text{if } i, j \in S(t) \\[4pt] 1, & \text{if } i = j \text{ and } i \notin S(t) \\[4pt] 0, & \text{otherwise} \end{cases}$$



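Observe that $A(t)$ is doubly stochastic. Then

$$
\begin{aligned}
\gamma(\mathbf{x}(t+1)) &= \gamma\big((1 - \epsilon)\mathbf{x}(t) + \epsilon A(t)\mathbf{x}(t)\big) && (\text{by definition of } \mathbf{x}(t+1)) \\
&\leq (1 - \epsilon)\gamma(\mathbf{x}(t)) + \epsilon\,\gamma(A(t)\mathbf{x}(t)) && (\text{since } \gamma \text{ is convex in } \mathbf{x}) \\
&\leq (1 - \epsilon)\gamma(\mathbf{x}(t)) + \epsilon\,\gamma(\mathbf{x}(t)) && (\text{by Proposition B.1}) \\
&= \gamma(\mathbf{x}(t))
\end{aligned}
$$

Proposition B.1. $\gamma(A(t)\mathbf{x}(t)) \leq \gamma(\mathbf{x}(t))$.

Proof. Let $\mathbf{y} := A(t)\mathbf{x}(t)$. Since $A(t)$ is doubly stochastic, it follows by a famous theorem by Hardy,
Littlewood and Polya that $\mathbf{x}(t)$ majorizes $\mathbf{y}$. Moreover, $\gamma(\mathbf{x})$ is a convex symmetric function.
Therefore, it is a Schur-convex function. By definition, a function $f : \mathbb{R}^n \to \mathbb{R}$ is Schur-convex if
$f(\mathbf{x}_1) \geq f(\mathbf{x}_2)$ whenever $\mathbf{x}_1$ majorizes $\mathbf{x}_2$. Therefore, $\gamma(\mathbf{y}) \leq \gamma(\mathbf{x}(t))$. □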

□ 
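
The inequality $\gamma(\mathbf{x}(t+1)) \leq \gamma(\mathbf{x}(t))$ can be illustrated on a small example. The Python sketch below applies one update in the matrix form used in the proof, taking the set $S(t)$ as given (how the flocking process selects $S(t)$ is defined in Section 3 and is not reproduced here), and assumes $\gamma(\mathbf{x}) = \sum_{i<j} (x_i - x_j)^2$. The function names and the example vector are illustrative assumptions.

```python
def gdi(x):
    """gamma(x): sum of squared opinion differences over all pairs i < j."""
    n = len(x)
    return sum((x[i] - x[j]) ** 2 for i in range(n) for j in range(i + 1, n))

def flocking_step(x, S, eps):
    """x(t+1) = (1 - eps) x(t) + eps A(t) x(t): nodes in S move toward the
    average opinion of S; all other nodes keep their opinion."""
    avg = sum(x[i] for i in S) / len(S)
    return [(1 - eps) * xi + eps * avg if i in S else xi
            for i, xi in enumerate(x)]

x = [0.1, 0.4, 0.9, 0.6]
x_next = flocking_step(x, S={0, 1, 2}, eps=0.5)
before, after = gdi(x), gdi(x_next)
```

Because $A(t)$ is doubly stochastic, the update also preserves the sum of opinions, which gives a second quick check on any implementation.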



C Proof of Theorem 4 

To prove the theorem, we begin by making three simple observations that hold for all $b > 0$. The
first observation follows directly from the symmetry of nodes in each set $V_1$ and $V_2$.

Lemma C.1. Consider nodes $i, j \in V$ such that either both $i, j \in V_1$ or both $i, j \in V_2$. Then for all
$t \geq 0$, $x_i(t) = x_j(t)$.

The next observation allows us to focus on only analyzing the equilibrium opinion of nodes in
$V_1$.

Lemma C.2. Consider a node $i \in V_1$ and a node $j \in V_2$. Then, for all $t \geq 0$, $x_i(t) = 1 - x_j(t)$.

Proof of Lemma C.2. By induction.

Induction hypothesis: Assume that the statement holds for some $t \geq 0$.

Base case: The statement holds for $t = 0$ by assumption in the theorem statement.

We will now show that the statement holds for $t + 1$.

$$\frac{x_i(t+1)}{1 - x_i(t+1)} = \frac{(x_i(t))^b}{(1 - x_i(t))^b} \cdot \frac{s_i(t)}{d_i - s_i(t)} \tag{C.1}$$

where $d_i = n(p_s + p_d)$ and, by Lemma C.1, $s_i(t) = n(p_s x_i(t) + p_d x_j(t))$. On the other hand,

$$\frac{x_j(t+1)}{1 - x_j(t+1)} = \frac{(x_j(t))^b}{(1 - x_j(t))^b} \cdot \frac{s_j(t)}{d_j - s_j(t)} \tag{C.2}$$

where $s_j(t) = n(p_s x_j(t) + p_d x_i(t))$, and $d_j = n(p_s + p_d) = d_i$. By the induction hypothesis, we know
that $x_i(t) = 1 - x_j(t)$. It follows that $s_i(t) = d_i - s_j(t)$. Substituting this into (C.1), we get

$$\frac{x_i(t+1)}{1 - x_i(t+1)} = \frac{(x_i(t))^b}{(1 - x_i(t))^b} \cdot \frac{s_i(t)}{d_i - s_i(t)} = \frac{(1 - x_j(t))^b}{(x_j(t))^b} \cdot \frac{d_j - s_j(t)}{s_j(t)} = \frac{1 - x_j(t+1)}{x_j(t+1)}$$

where the last equality follows from (C.2). It follows that $x_i(t + 1) = 1 - x_j(t + 1)$.

This completes the inductive proof. □

Lemma C.2 implies that if we prove the theorem statement for nodes in $V_1$, we get the proof for
nodes in $V_2$ for free. So, in the rest of the proof, we only make statements about nodes in $V_1$. The
third observation lower bounds the opinions of nodes in $V_1$.

Lemma C.3. Consider a node $i \in V_1$. For all $t \geq 0$, $x_i(t) \in [\frac{1}{2}, 1]$.

Proof of Lemma C.3. It is easy to see that for all $t \geq 0$, $x_i(t) \leq 1$. We will prove that $x_i(t) \geq \frac{1}{2}$ by
induction over $t$.

Base case: The statement holds for $t = 0$ by assumption in the theorem statement.

Induction hypothesis: Assume that the lemma statement holds for some $t \geq 0$, i.e., assume that
$x_i(t) \geq \frac{1}{2}$ for some $t \geq 0$.

We will show that the lemma statement holds for $t + 1$.

$$
\begin{aligned}
\frac{x_i(t+1)}{1 - x_i(t+1)} &= \frac{(x_i(t))^b}{(1 - x_i(t))^b} \cdot \frac{s_i(t)}{d_i - s_i(t)} \\
&\geq \frac{(x_i(t))^b}{(1 - x_i(t))^b} && (\text{since } s_i(t) \geq d_i - s_i(t)) \\
&\geq 1 && (\text{since } x_i(t) \geq \tfrac{1}{2} \text{ by the induction hypothesis, and } b > 0)
\end{aligned}
$$

This implies $x_i(t + 1) \geq \frac{1}{2}$, completing the inductive proof. □
Recall that $i$'s opinion at time $t + 1$ is given by

$$x_i(t+1) = \frac{(x_i(t))^b s_i(t)}{(x_i(t))^b s_i(t) + (1 - x_i(t))^b (d_i - s_i(t))} \quad (\text{by (2.3)})$$

where $s_i(t) = n(p_s x_i(t) + p_d(1 - x_i(t)))$, and $d_i = n(p_s + p_d)$. Now consider the equation

$$x_i(t+1) = x_i(t) \tag{C.3}$$

We will show that if $b > 1$ or $b < \frac{2}{h_G + 1}$, (C.3) has no solution in $(\frac{1}{2}, 1)$, whereas if $1 > b > \frac{2}{h_G + 1}$,
there exists a unique solution to (C.3) in $(\frac{1}{2}, 1)$.

Lemma C.4. Consider a node $i \in V_1$. Fix $t > 0$.

(a) If $b > 1$, for every $x_i(t) \in (\frac{1}{2}, 1)$, $x_i(t + 1) > x_i(t)$.

(b) If $1 > b > \frac{2}{h_G + 1}$, there exists a unique solution, say $\hat{x}$, to Eq. (C.3) in $(\frac{1}{2}, 1)$.

(c) If $b < \frac{2}{h_G + 1}$, for every $x_i(t) \in (\frac{1}{2}, 1)$, $x_i(t + 1) < x_i(t)$.

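Proof of Lemma C.4. Consider the function $f : [0, 1] \to \mathbb{R}$ defined as

$$f(y; b) = \begin{cases} 1, & y \in [0,1],\ b = 1 \\ 0, & y \in [0,1],\ b = 2 \\ \dfrac{2}{b} - 1, & y = \frac{1}{2},\ b > 0 \\[6pt] \dfrac{y^{2-b} - (1 - y)^{2-b}}{y(1 - y)^{1-b} - y^{1-b}(1 - y)}, & \text{otherwise} \end{cases} \tag{C.4}$$

We will first prove a few properties of $f$ and then use those properties to prove Lemma C.4.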
Proposition C.1. 1. For all $b > 0$, $f$ is continuous over $[0, 1]$.

2. If $0 < b < 1$, $f$ is strictly increasing over $[\frac{1}{2}, 1]$.

3. If $b > 1$, for all $y \in [0, 1)$, $f(y; b) < 1$.

Proof. 1. Observe that $f$ is continuous when $b = 1$ or $b = 2$. So, we only need to show that
$f$ is continuous at $y = \frac{1}{2}$ when $b \neq 1$ and $b \neq 2$. Let $p(y; b) := y^{2-b} - (1 - y)^{2-b}$ and
$q(y; b) := y(1 - y)^{1-b} - y^{1-b}(1 - y)$. Observe that when $b \neq 1$ and $b \neq 2$, both $p$ and $q$ are
differentiable on $[0, 1]$. For $y \in [0, 1]$,

$$p'(y; b) = (2 - b)\big(y^{1-b} + (1 - y)^{1-b}\big); \qquad q'(y; b) = (1 - y)^{1-b} - (1 - b)\,y(1 - y)^{-b} - (1 - b)\,y^{-b}(1 - y) + y^{1-b}$$

Therefore,

$$\lim_{y \to 1/2} \frac{p'(y; b)}{q'(y; b)} = \lim_{y \to 1/2} \frac{(2 - b)\big(y^{1-b} + (1 - y)^{1-b}\big)}{(1 - y)^{1-b} - (1 - b)\,y(1 - y)^{-b} - (1 - b)\,y^{-b}(1 - y) + y^{1-b}} = \frac{2}{b} - 1 \tag{C.5}$$

So, we have that

$$\lim_{y \to 1/2} f(y; b) = \lim_{y \to 1/2} \frac{p(y; b)}{q(y; b)} = \lim_{y \to 1/2} \frac{p'(y; b)}{q'(y; b)} \ (\text{using L'Hopital's rule}) = \frac{2}{b} - 1 \ (\text{from (C.5)}) = f(\tfrac{1}{2}; b)$$

Therefore, when $b \neq 1$ and $b \neq 2$, $f$ is continuous at $\frac{1}{2}$.

2. Assume $0 < b < 1$. Fix $y_1, y_2 \in [\frac{1}{2}, 1]$ such that $y_1 > y_2$. We will show that $f(y_1; b) > f(y_2; b)$.
For conciseness of expression, define $\bar{y}_1 := 1 - y_1$ and $\bar{y}_2 := 1 - y_2$. Then

2/12/2 - 2/12/2 > (yw2) 1 ~ b - (yim) 1 ^ (C.6) 

Similarly, 

y"iy2 - yiz/2 > (yw2) 1 ~ b - {mm) 1 ' 1 ' (C.7) 

Adding (C.6) and (C.7), we get 

2/12/2 - 2/12/2 + 2/12/2 - 2/12/2 > (yiy2) 1-6 - (yiylO 1-6 + (yly2) 1-6 - (yiysO 1 " 6 

Or equivalently, 

(yiy2 - yiy 2 ) - ((yiy2) 1_b - (Sifc) 1-6 ) > (vm - vm) - [(ym) 1 ^ - (yiy 2 ) 1-6 ) (C.8) 

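Moreover, since $y_1, y_2 \in [\frac{1}{2}, 1]$ and $y_1 > y_2$,

$$y_1 y_2 - \bar{y}_1 \bar{y}_2 > 0; \quad (y_1 y_2)^{1-b} - (\bar{y}_1 \bar{y}_2)^{1-b} > 0; \quad y_1 \bar{y}_2 - \bar{y}_1 y_2 > 0; \quad (y_1 \bar{y}_2)^{1-b} - (\bar{y}_1 y_2)^{1-b} > 0 \tag{C.9}$$

(C.8) and (C.9) imply that

$$\frac{y_1 y_2 - \bar{y}_1 \bar{y}_2}{y_1 \bar{y}_2 - \bar{y}_1 y_2} > \frac{(y_1 y_2)^{1-b} - (\bar{y}_1 \bar{y}_2)^{1-b}}{(y_1 \bar{y}_2)^{1-b} - (\bar{y}_1 y_2)^{1-b}}$$

Rearranging, we get

$$\frac{y_1^{2-b} - \bar{y}_1^{\,2-b}}{y_1 \bar{y}_1^{\,1-b} - y_1^{1-b} \bar{y}_1} = f(y_1; b) > \frac{y_2^{2-b} - \bar{y}_2^{\,2-b}}{y_2 \bar{y}_2^{\,1-b} - y_2^{1-b} \bar{y}_2} = f(y_2; b)$$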

3. Since $f$ is symmetric about $y = \frac{1}{2}$, we will prove the statement for $y \in (\frac{1}{2}, 1)$. Fix $y \in (\frac{1}{2}, 1)$.
Observe that when $b > 1$, $(1 - y)^{1-b} > y^{1-b}$ (since $y > 1 - y$). Equivalently,

$$y(1 - y)^{1-b} > y^{2-b} \tag{C.10}$$

For the same reason,

$$y^{1-b}(1 - y) < (1 - y)^{2-b} \tag{C.11}$$

From (C.10) and (C.11), it follows that

$$y(1 - y)^{1-b} - y^{1-b}(1 - y) > y^{2-b} - (1 - y)^{2-b}$$

or equivalently, $f(y; b) < 1$.

□ 

Using these properties of $f$ we will prove Lemma C.4.

1. If $b > 1$, then for all $y \in [0,1)$, $f(y; b) < 1$ (by Proposition C.1) $< h_G$. Therefore, for
$y \in (\frac{1}{2}, 1)$,

$$
\begin{aligned}
& \frac{y^{2-b} - (1 - y)^{2-b}}{y(1 - y)^{1-b} - y^{1-b}(1 - y)} < h_G \\
\Leftrightarrow\ & y^{2-b} - (1 - y)^{2-b} < h_G \big( y(1 - y)^{1-b} - y^{1-b}(1 - y) \big) \\
\Leftrightarrow\ & y^{2-b} + h_G\, y^{1-b}(1 - y) < (1 - y)^{2-b} + h_G\, y(1 - y)^{1-b} \\
\Leftrightarrow\ & y^{1-b} \big( y + (1 - y) h_G \big) < (1 - y)^{1-b} \big( (1 - y) + h_G\, y \big) \\
\Leftrightarrow\ & \frac{y}{1 - y} < \left( \frac{y}{1 - y} \right)^{b} \frac{(1 - y) + h_G\, y}{y + (1 - y) h_G}
\end{aligned}
$$

For $y = x_i(t)$, the right hand side of the last inequality above is equal to $x_i(t+1)/(1 - x_i(t+1))$,
implying that $x_i(t + 1) > x_i(t)$.

2. If $1 > b > \frac{2}{h_G + 1}$, then observe that $f(\frac{1}{2}; b) = \frac{2}{b} - 1 < h_G < f(1; b) = \infty$. Since $f$ is a
continuous function (by Proposition C.1), by the intermediate value theorem, there
must exist a $y \in (\frac{1}{2}, 1)$ such that $f(y; b) = h_G$. Equivalently,

$$\frac{y^{2-b} - (1 - y)^{2-b}}{y(1 - y)^{1-b} - y^{1-b}(1 - y)} = h_G$$

Rearranging the above expression, we get

$$\frac{y}{1 - y} = \left( \frac{y}{1 - y} \right)^{b} \frac{(1 - y) + h_G\, y}{y + (1 - y) h_G}$$

Again, for $y = x_i(t)$, we have that $x_i(t + 1) = x_i(t)$. The uniqueness of $\hat{x}$ follows from the fact
that, by Proposition C.1, $f$ is strictly increasing over $[\frac{1}{2}, 1]$.
3. If $b < \frac{2}{h_G + 1}$, then for all $y \in [\frac{1}{2}, 1]$, $f(y; b) \geq f(\frac{1}{2}; b)$ (by Proposition C.1) $= \frac{2}{b} - 1 > h_G$. In
other words,

$$\frac{y^{2-b} - (1 - y)^{2-b}}{y(1 - y)^{1-b} - y^{1-b}(1 - y)} > h_G$$

Again, rearranging the above expression, we get

$$\frac{y}{1 - y} > \left( \frac{y}{1 - y} \right)^{b} \frac{(1 - y) + h_G\, y}{y + (1 - y) h_G}$$

Again, for $y = x_i(t)$, the right hand side of the last inequality above is equal to $x_i(t+1)/(1 - x_i(t+1))$,
implying that $x_i(t + 1) < x_i(t)$.

This concludes the proof of Lemma C.4. □

Next we will prove Theorem 4 for the case of persistent disagreement; the cases of polarization
and consensus are limiting cases of that case as $b \to 1$ and $b \to 2/(h_G + 1)$ respectively. We will
show that when $1 > b > \frac{2}{h_G + 1}$, the value $\hat{x}$ defined in Lemma C.4(b) is a stable equilibrium. The
other two cases can be formally proven using an argument similar to the one below. Next we will
show that when $1 > b > \frac{2}{h_G + 1}$, the sequence $\{x_i(t)\}$ is bounded.

Lemma C.5. Consider a node $i \in V_1$. Let $1 > b > \frac{2}{h_G + 1}$. Let $\hat{x} \in (\frac{1}{2}, 1)$ be the solution to (C.3).

1. If $x_0 < \hat{x}$, then for all $t \geq 0$, $x_i(t) < \hat{x}$.

2. If $x_0 > \hat{x}$, then for all $t \geq 0$, $x_i(t) > \hat{x}$.

Proof of Lemma C.5. We will prove statement (1). Statement (2) can be proven using a similar
argument.

Proof by induction.

Induction hypothesis: Assume that the lemma statement holds for some $t \geq 0$, i.e., assume that
$x_i(t) < \hat{x}$ for some $t \geq 0$.

Base case: The statement holds for $t = 0$ by assumption.

We will show that the lemma statement holds for $t + 1$.

$$\frac{x_i(t+1)}{1 - x_i(t+1)} = \frac{(x_i(t))^b}{(1 - x_i(t))^b} \cdot \frac{s_i(t)}{d_i - s_i(t)} < \frac{\hat{x}^b}{(1 - \hat{x})^b} \cdot \frac{s_i(t)}{d_i - s_i(t)} \quad (\text{since } \tfrac{1}{2} \leq x_i(t) < \hat{x}, \text{ and } b > 0)$$

Observe that since $x_i(t) < \hat{x}$ and $p_s > p_d$, $s_i(t) = n(p_s x_i(t) + p_d(1 - x_i(t))) < n(p_s \hat{x} + p_d(1 - \hat{x}))$.
Therefore,

$$\frac{s_i(t)}{d_i - s_i(t)} < \frac{p_s \hat{x} + p_d(1 - \hat{x})}{p_s(1 - \hat{x}) + p_d \hat{x}}$$

As a result,

$$\frac{x_i(t+1)}{1 - x_i(t+1)} < \frac{\hat{x}^b}{(1 - \hat{x})^b} \cdot \frac{p_s \hat{x} + p_d(1 - \hat{x})}{p_s(1 - \hat{x}) + p_d \hat{x}} = \frac{\hat{x}}{1 - \hat{x}} \quad (\text{by definition of } \hat{x})$$

This implies $x_i(t + 1) < \hat{x}$. This completes the inductive proof. □

Next we will prove that when $1 > b > \frac{2}{h_G + 1}$, the sequence $\{x_i(t)\}$ is monotone.

Lemma C.6. Consider a node $i \in V_1$. Let $1 > b > \frac{2}{h_G + 1}$. Let $\hat{x} \in (\frac{1}{2}, 1)$ be the solution to (C.3).

1. If $x_0 < \hat{x}$, the sequence $\{x_i(t)\}$ is strictly increasing.

2. If $x_0 > \hat{x}$, the sequence $\{x_i(t)\}$ is strictly decreasing.

Proof of Lemma C.6. We will prove statement (1); statement (2) can be proven using a similar
argument.

Assume $x_0 < \hat{x}$. Then, from Lemma C.5, we know that for all $t \geq 0$, $x_i(t) < \hat{x}$. Fix $t \geq 0$.
Let $x_i(t) = y < \hat{x}$. Recall that by definition of $\hat{x}$, if $x_i(t) = \hat{x}$, $x_i(t + 1) = x_i(t)$. Equivalently,
$f(\hat{x}; b) = h_G$, where $f$ is defined by (C.4). From Proposition C.1, we know that $f$ is strictly
increasing over the interval $(\frac{1}{2}, \hat{x})$. Therefore, $f(y; b) < f(\hat{x}; b) = h_G$. Equivalently,

$$\frac{y^{2-b} - (1 - y)^{2-b}}{y(1 - y)^{1-b} - y^{1-b}(1 - y)} < h_G$$

Rearranging, we get

$$\frac{y}{1 - y} < \left( \frac{y}{1 - y} \right)^{b} \frac{(1 - y) + h_G\, y}{y + (1 - y) h_G} = \frac{x_i(t+1)}{1 - x_i(t+1)}$$

Equivalently, $x_i(t + 1) > x_i(t)$. □

Using the fact that the sequence $\{x_i(t)\}$ is monotone and bounded, next we will prove that it
converges to $\hat{x}$.

Lemma C.7. Consider a node $i \in V_1$. Let $1 > b > \frac{2}{h_G + 1}$. Let $\hat{x} \in (\frac{1}{2}, 1)$ be the solution to (C.3).
Then, $\lim_{t \to \infty} x_i(t) = \hat{x}$.

Proof. For the proof, we will assume that the initial opinion $x_i(0) = x_0 \leq \hat{x}$. The case when $x_0 > \hat{x}$
can be argued in an analogous way.

Observe that if $x_0 = \hat{x}$, then by Lemma C.4, it follows that for all $t \geq 0$, $x_i(t + 1) = \hat{x}$, and we
are done. So let us assume that $\frac{1}{2} < x_0 < \hat{x}$. From Lemma C.5 and Lemma C.6, we know that the
sequence $\{x_i(t)\}$ is strictly increasing and bounded. This implies that the sequence must converge
either to $\hat{x}$ or to some value in the interval $[x_0, \hat{x})$. Consider the function $g : [0, 1] \to \mathbb{R}$ defined as

$$g(y) := \frac{y^b (h_G\, y + (1 - y))}{y^b (h_G\, y + (1 - y)) + (1 - y)^b (h_G (1 - y) + y)} - y$$

Observe that for all $t \geq 0$, $x_i(t + 1) - x_i(t) = g(x_i(t))$. Therefore,

(a) for all $y \in (\frac{1}{2}, \hat{x})$, $g(y) > 0$ (since, by Lemma C.6, the sequence $\{x_i(t)\}$ is strictly increasing),
and

(b) $g(\hat{x}) = 0$ (by definition of $\hat{x}$).

For the purpose of contradiction, assume that $\lim_{t \to \infty} x_i(t) = a$, where $x_0 < a < \hat{x}$. This implies that
for every $\epsilon > 0$, there exists a $t(\epsilon)$ such that for all $t > t(\epsilon)$, $x_i(t + 1) - x_i(t) < \epsilon$, or equivalently,
that for all $t > t(\epsilon)$, $g(x_i(t)) < \epsilon$.

Let $\min_{y \in [x_0, a]} g(y) = c$. It implies that for all $y \in [x_0, a]$, $g(y) \geq c$. From (a), it follows that $c > 0$.
Setting $\epsilon = c$, our analysis implies the following two properties of $g$: (1) for all $t \geq 0$, $g(x_i(t)) \geq c$,
and (2) for all $t > t(\epsilon)$, $g(x_i(t)) < c$, which contradict each other. This completes the proof by
contradiction. □

This completes the proof of Theorem 4. 
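
The three regimes of Theorem 4 can be reproduced numerically. The Python sketch below iterates the update for a node in $V_1$ after dividing $s_i(t)$ and $d_i - s_i(t)$ by $n p_d$, so that only the ratio $h_G = p_s / p_d$ appears; treating $h_G$ as this ratio is an assumption read off from the algebra in the proof of Lemma C.4. The function names, the value $h_G = 3$ (threshold $2/(h_G + 1) = 0.5$), and the starting opinions are illustrative.

```python
def two_island_step(x, b, h):
    """One update of x_i(t) for a node in V_1, using x_j(t) = 1 - x_i(t)."""
    a = (x ** b) * (h * x + (1 - x))          # proportional to x^b * s_i
    c = ((1 - x) ** b) * (h * (1 - x) + x)    # proportional to (1-x)^b * (d_i - s_i)
    return a / (a + c)

def run(x0, b, h, steps=5000):
    x = x0
    for _ in range(steps):
        x = two_island_step(x, b, h)
    return x

h = 3.0
polarized = run(0.6, b=2.0, h=h)     # b > 1: opinion is driven to 1
disagree = run(0.6, b=0.75, h=h)     # 1 > b > 2/(h+1): interior equilibrium
consensus = run(0.8, b=0.3, h=h)     # b < 2/(h+1): opinion returns to 1/2
```

By Lemma C.2, nodes in $V_2$ hold the mirrored opinion $1 - x_i(t)$, so the first run corresponds to the two islands ending at 1 and 0, and the last to both ending at $\frac{1}{2}$.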
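D Proofs of Section 5

Proof of Theorem 6.

Lemma D.1. In the limit as $n \to \infty$, SimpleSALSA is polarizing with respect to $i$ if and only if $i$
is biased.

Proof. Assume without loss of generality that $x_i > \frac{1}{2}$.

Let $p_r$ be the probability that SimpleSALSA recommends a RED book. The proof consists of
two steps: first we show that $p_r > \frac{1}{2}$ and $p_r < x_i$, and then we show that if $p_r > \frac{1}{2}$ and $p_r < x_i$,
SimpleSALSA is polarizing with respect to $i$ if and only if $i$ is biased.

$$
\begin{aligned}
p_r &= \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} P[i \xrightarrow{3} j] \\
&= \sum_{\substack{j_1 \in N(i) \\ j_1 \text{ is RED}}} P[i \xrightarrow{1} j_1] \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} P[j_1 \xrightarrow{2} j] + \sum_{\substack{j_2 \in N(i) \\ j_2 \text{ is BLUE}}} P[i \xrightarrow{1} j_2] \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} P[j_2 \xrightarrow{2} j] \\
&= \sum_{\substack{j_1 \in N(i) \\ j_1 \text{ is RED}}} \frac{1}{|N(i)|} \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} P[j_1 \xrightarrow{2} j] + \sum_{\substack{j_2 \in N(i) \\ j_2 \text{ is BLUE}}} \frac{1}{|N(i)|} \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} P[j_2 \xrightarrow{2} j] \\
&= \sum_{\substack{j_1 \in V_2 \\ j_1 \text{ is RED}}} \frac{Z_{ij_1}}{|N(i)|} \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} P[j_1 \xrightarrow{2} j] + \sum_{\substack{j_2 \in V_2 \\ j_2 \text{ is BLUE}}} \frac{Z_{ij_2}}{|N(i)|} \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} P[j_2 \xrightarrow{2} j] \\
&= \sum_{\substack{j_1 \in V_2 \\ j_1 \text{ is RED}}} \frac{Z_{ij_1}}{|N(i)|} \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} \sum_{i' \in V_1} \frac{Z_{i'j_1} Z_{i'j}}{|N(j_1)|\,|N(i')|} + \sum_{\substack{j_2 \in V_2 \\ j_2 \text{ is BLUE}}} \frac{Z_{ij_2}}{|N(i)|} \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} \sum_{i' \in V_1} \frac{Z_{i'j_2} Z_{i'j}}{|N(j_2)|\,|N(i')|}
\end{aligned}
$$

By Lemma 5.1, in the limit as $n \to \infty$, with probability 1,

$$\sum_{\substack{j_1 \in V_2 \\ j_1 \text{ is RED}}} \frac{Z_{ij_1}}{|N(i)|} \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} \sum_{i' \in V_1} \frac{Z_{i'j_1} Z_{i'j}}{|N(j_1)|\,|N(i')|} \to x_i \cdot n \cdot \frac{1}{\frac{mk}{2n} \cdot k} \cdot \frac{mk^2}{n^2} \left( \frac{1}{4} + \mathrm{Var}(x_1) \right) = x_i \left( \frac{1}{2} + 2\mathrm{Var}(x_1) \right)$$

and

$$\sum_{\substack{j_2 \in V_2 \\ j_2 \text{ is BLUE}}} \frac{Z_{ij_2}}{|N(i)|} \sum_{\substack{j \in V_2 \\ j \text{ is RED}}} \sum_{i' \in V_1} \frac{Z_{i'j_2} Z_{i'j}}{|N(j_2)|\,|N(i')|} \to (1 - x_i) \cdot n \cdot \frac{1}{\frac{mk}{2n} \cdot k} \cdot \frac{mk^2}{n^2} \left( \frac{1}{4} - \mathrm{Var}(x_1) \right) = (1 - x_i) \left( \frac{1}{2} - 2\mathrm{Var}(x_1) \right)$$

Therefore, in the limit as $n \to \infty$, with probability 1,

$$p_r \to x_i \left( \frac{1}{2} + 2\mathrm{Var}(x_1) \right) + (1 - x_i) \left( \frac{1}{2} - 2\mathrm{Var}(x_1) \right)$$

Since $x_i > \frac{1}{2}$ (by assumption), and $\mathrm{Var}(x_1) > 0$ (by assumption), we have that

$$p_r > \frac{1}{2} \quad \text{and} \quad p_r < x_i \tag{D.1}$$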
First, assume that $i$ is unbiased. Let $p$ be the probability that $i$ accepts the recommendation.
Therefore, the probability that the recommended book was RED given that $i$ accepted the recommendation
is given by

$$\frac{p_r\, p}{p_r\, p + (1 - p_r)\, p} = p_r < x_i$$

Therefore, SimpleSALSA is not polarizing.

Now, assume that $i$ is biased. This implies that $i$ accepts the recommendation of a RED book with
probability $x_i$ and that of a BLUE book with probability $1 - x_i$. Therefore, the probability that the
recommended book was RED given that $i$ accepted the recommendation is given by

$$\frac{p_r x_i}{p_r x_i + (1 - x_i)(1 - p_r)} > \frac{p_r x_i}{p_r x_i + p_r (1 - x_i)} \ (\text{since } p_r > \tfrac{1}{2}, \text{ from (D.1)}) = x_i$$

Therefore, by definition, SimpleSALSA is polarizing. □
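
The two cases in the proof of Lemma D.1 can be checked with a short Python sketch. It plugs the limiting value of $p_r$ into Bayes' rule for an unbiased and a biased individual. The function names, the acceptance probability 0.5 for the unbiased case, and the example values ($x_i = 0.8$, and $\mathrm{Var}(x_1) = 1/12$ as for a uniform density) are illustrative assumptions.

```python
def salsa_p_red(x_i, var):
    """Limiting probability that SimpleSALSA recommends a RED book to i."""
    return x_i * (0.5 + 2 * var) + (1 - x_i) * (0.5 - 2 * var)

def posterior_red(p_red, accept_red, accept_blue):
    """P[recommended book was RED | i accepted], by Bayes' rule."""
    num = p_red * accept_red
    return num / (num + (1 - p_red) * accept_blue)

x_i, var = 0.8, 1.0 / 12.0                    # uniform g has variance 1/12
p_r = salsa_p_red(x_i, var)                   # lies strictly between 1/2 and x_i
unbiased = posterior_red(p_r, 0.5, 0.5)       # same acceptance probability for both colors
biased = posterior_red(p_r, x_i, 1 - x_i)     # acceptance probabilities of Definition 5.2
```

For the unbiased individual the posterior equals $p_r$, which is below $x_i$, so Definition 5.3 is not met; for the biased individual the posterior exceeds $x_i$.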

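Lemma D.2. In the limit as $n \to \infty$ and as $T \to \infty$, SimpleICF is polarizing with respect to $i$ if
and only if $i$ is biased.

Proof. Assume without loss of generality that $x_i > \frac{1}{2}$.

Let $p_r$ be the probability that SimpleICF recommends a RED book. For a node $j \in N(i)$, let
$q_{j,\mathrm{RED}}$ be the probability that after $T$ two-step random walks starting at $j$, the node with the largest
value of count(·), i.e., $j^*$, is RED, and $q_{j,\mathrm{BLUE}}$ be the corresponding probability that $j^*$ is BLUE. Then,

$$
\begin{aligned}
p_r &= \sum_{\substack{j_1 \in N(i) \\ j_1 \text{ is RED}}} P[i \xrightarrow{1} j_1]\, q_{j_1,\mathrm{RED}} + \sum_{\substack{j_2 \in N(i) \\ j_2 \text{ is BLUE}}} P[i \xrightarrow{1} j_2]\, q_{j_2,\mathrm{RED}} \\
&= \sum_{\substack{j_1 \in N(i) \\ j_1 \text{ is RED}}} \frac{1}{|N(i)|}\, q_{j_1,\mathrm{RED}} + \sum_{\substack{j_2 \in N(i) \\ j_2 \text{ is BLUE}}} \frac{1}{|N(i)|}\, q_{j_2,\mathrm{RED}} \\
&= \sum_{\substack{j_1 \in V_2 \\ j_1 \text{ is RED}}} \frac{Z_{ij_1}}{|N(i)|}\, q_{j_1,\mathrm{RED}} + \sum_{\substack{j_2 \in V_2 \\ j_2 \text{ is BLUE}}} \frac{Z_{ij_2}}{|N(i)|}\, q_{j_2,\mathrm{RED}}
\end{aligned}
$$

Consider $T$ two-step random walks starting at a node $j_1 \in N(i)$. Observe that $q_{j_1,\mathrm{RED}}$ is exactly the
probability that after these $T$ random walks, there exists a RED node, say $j$, such that count(j) >
count(j') for all BLUE nodes $j'$. However, as $T \to \infty$,

$$P[\text{for all BLUE books } j' \in V_2,\ \mathrm{count}(j) > \mathrm{count}(j')] = P[\text{for all BLUE books } j' \in V_2,\ P[j_1 \xrightarrow{2} j] > P[j_1 \xrightarrow{2} j']]$$

since as $T \to \infty$, count(j) $\to T \cdot P[j_1 \xrightarrow{2} j]$ (by the Strong Law of Large Numbers). Therefore,

$$q_{j_1,\mathrm{RED}} = P[\text{for all BLUE books } j' \in V_2,\ P[j_1 \xrightarrow{2} j] > P[j_1 \xrightarrow{2} j']]$$

Observe that for two RED books $j_1$ and $j$,

$$P[j_1 \xrightarrow{2} j] = \sum_{i' \in V_1} \frac{Z_{i'j_1} Z_{i'j}}{|N(j_1)|\,|N(i')|}$$

By Lemma 5.1, in the limit as $n \to \infty$, with probability 1,

$$P[j_1 \xrightarrow{2} j] \to \frac{1}{k} \cdot \frac{1}{mk/2n} \cdot \frac{mk^2}{n^2} \left( \frac{1}{4} + \mathrm{Var}(x_1) \right) = \frac{1}{n} \left( \frac{1}{2} + 2\mathrm{Var}(x_1) \right)$$

Similarly, for a BLUE book $j'$, in the limit as $n \to \infty$, with probability 1,

$$P[j_1 \xrightarrow{2} j'] \to \frac{1}{k} \cdot \frac{1}{mk/2n} \cdot \frac{mk^2}{n^2} \left( \frac{1}{4} - \mathrm{Var}(x_1) \right) = \frac{1}{n} \left( \frac{1}{2} - 2\mathrm{Var}(x_1) \right)$$

Since $\mathrm{Var}(x_1) > 0$, in the limit as $n \to \infty$, $P[j_1 \xrightarrow{2} j] > P[j_1 \xrightarrow{2} j']$ with probability 1. Therefore,
$q_{j_1,\mathrm{RED}} = 1$. By symmetry, $q_{j_2,\mathrm{RED}} = 1 - q_{j_2,\mathrm{BLUE}} = 0$. Moreover, by Lemma 5.1, in the limit as
$n \to \infty$, $\sum_{j_1 \in V_2:\, j_1 \text{ is RED}} \frac{Z_{ij_1}}{|N(i)|} \to x_i$ with probability 1. Therefore, as $n \to \infty$,

$$p_r \to x_i \tag{D.2}$$

The rest of the analysis is identical to Lemma D.1. □

This completes the proof of Theorem 6. □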

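Proof of Theorem 5. Assume, without loss of generality, that $x_i > \frac{1}{2}$.

Let $p_r$ be the probability that SimplePPR recommends a RED book to $i$. This probability is
exactly equal to the probability that after $T$ three-step random walks starting at $i$ there exists a
RED node, say $j$, such that count(j) > count(j') for all BLUE nodes $j'$. However, as
$T \to \infty$,

$$P[\text{for all BLUE books } j' \in V_2,\ \mathrm{count}(j) > \mathrm{count}(j')] = P[\text{for all BLUE books } j' \in V_2,\ P[i \xrightarrow{3} j] > P[i \xrightarrow{3} j']]$$

since as $T \to \infty$, count(j) $\to T \cdot P[i \xrightarrow{3} j]$ with probability 1 (by the Strong Law of Large Numbers).
Therefore,

$$p_r = P[\text{for all BLUE books } j' \in V_2,\ P[i \xrightarrow{3} j] > P[i \xrightarrow{3} j']]$$

For a RED book $j \in V_2$,

$$
\begin{aligned}
P[i \xrightarrow{3} j] &= \sum_{\substack{j_1 \in N(i) \\ j_1 \text{ is RED}}} P[i \xrightarrow{1} j_1]\, P[j_1 \xrightarrow{2} j] + \sum_{\substack{j_2 \in N(i) \\ j_2 \text{ is BLUE}}} P[i \xrightarrow{1} j_2]\, P[j_2 \xrightarrow{2} j] \\
&= \sum_{\substack{j_1 \in N(i) \\ j_1 \text{ is RED}}} \frac{1}{|N(i)|}\, P[j_1 \xrightarrow{2} j] + \sum_{\substack{j_2 \in N(i) \\ j_2 \text{ is BLUE}}} \frac{1}{|N(i)|}\, P[j_2 \xrightarrow{2} j] \\
&= \sum_{\substack{j_1 \in V_2 \\ j_1 \text{ is RED}}} \frac{Z_{ij_1}}{|N(i)|}\, P[j_1 \xrightarrow{2} j] + \sum_{\substack{j_2 \in V_2 \\ j_2 \text{ is BLUE}}} \frac{Z_{ij_2}}{|N(i)|}\, P[j_2 \xrightarrow{2} j]
\end{aligned}
$$

As we showed in the proof of Lemma D.2, in the limit as $n \to \infty$,

$$P[j_1 \xrightarrow{2} j] \to \frac{1}{n} \left( \frac{1}{2} + 2\mathrm{Var}(x_1) \right) \quad \text{and (by symmetry)} \quad P[j_2 \xrightarrow{2} j] \to \frac{1}{n} \left( \frac{1}{2} - 2\mathrm{Var}(x_1) \right)$$

with probability 1. Moreover, by Lemma 5.1, in the limit as $n \to \infty$, $\sum_{j_1 \in V_2:\, j_1 \text{ is RED}} \frac{Z_{ij_1}}{|N(i)|} \to x_i$ with
probability 1. Therefore, with probability 1,

$$P[i \xrightarrow{3} j] \to \frac{x_i}{n} \left( \frac{1}{2} + 2\mathrm{Var}(x_1) \right) + \frac{1 - x_i}{n} \left( \frac{1}{2} - 2\mathrm{Var}(x_1) \right)$$

Similarly, for a BLUE book $j' \in V_2$, in the limit as $n \to \infty$, with probability 1,

$$P[i \xrightarrow{3} j'] \to \frac{x_i}{n} \left( \frac{1}{2} - 2\mathrm{Var}(x_1) \right) + \frac{1 - x_i}{n} \left( \frac{1}{2} + 2\mathrm{Var}(x_1) \right)$$

Since $x_i > \frac{1}{2}$ and $\mathrm{Var}(x_1) > 0$,

$$P[i \xrightarrow{3} j] > P[i \xrightarrow{3} j']$$

with probability 1. In other words, $p_r = 1$. So, the probability that a book recommended by
SimplePPR was RED given that it was accepted is exactly $p_r$ regardless of whether $i$ is biased or
unbiased. Therefore, SimplePPR is polarizing.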

□ 
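
The key comparison in the proof of Theorem 5 is between the limiting three-step walk probabilities to a RED book and to a BLUE book. The Python sketch below evaluates both limits; the function name and the example values ($\mathrm{Var}(x_1) = 1/12$, $n = 1000$) are illustrative assumptions.

```python
def ppr_three_step_limits(x_i, var, n):
    """Limiting P[i -3-> j] for a single RED book j and a single BLUE book j'."""
    red = (x_i * (0.5 + 2 * var) + (1 - x_i) * (0.5 - 2 * var)) / n
    blue = (x_i * (0.5 - 2 * var) + (1 - x_i) * (0.5 + 2 * var)) / n
    return red, blue

var, n = 1.0 / 12.0, 1000
red_hi, blue_hi = ppr_three_step_limits(0.55, var, n)   # x_i slightly above 1/2
red_lo, blue_lo = ppr_three_step_limits(0.45, var, n)   # x_i slightly below 1/2
```

The gap between the two limits is $(2x_i - 1) \cdot 4\mathrm{Var}(x_1)/n$, so even a slight lean toward RED makes every RED book more likely than every BLUE book to be the most visited as $T \to \infty$. This is why the arg max in SimplePPR returns a RED book with probability 1.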